This post was originally published on the Mapzen
Weblog https://mapzen.com/blog/iamhere/ in February 2016.


tl;dr 


https://whosonfirst.mapzen.com/iamhere/ is a shiny new version of Simon
Willison's http://blog.simonwillison.net/ classic "Get Lat Lon" application full
of Mapzen-y goodness https://github.com/whosonfirst/whosonfirst-www-iamhere .


A short history 

In late 2007 Simon Willison http://blog.simonwillison.net/ launched
what some people have described as "the most useful website on the internet". The
website was called Get Lat
Lon https://web.archive.org/web/20071013001113/http://getlatlon.com/ and
its entire purpose was to enable a visitor to "find the latitude and longitude of a point on a
map".


The website was built using the Google Maps
API https://developers.google.com/maps/ and had a form for geocoding
addresses or place names but the primary interface was a simple map with a set of 
crosshairs centered in the viewport. Get Lat Lon would simply print the geographic 
coordinates of whatever location happened to be beneath the crosshairs. Brilliant! 


Somewhere between 2007 and now the domain renewal for Get Lat Lon lapsed 
and now it's... something else entirely, something not worth linking to. You can still get a 
feeling for the simplicity and elegance of its overall design because there are snapshots of 
the website in the Wayback Machine https://web.archive.org/ except... none of 
the Javascript works anymore. 




[Screenshot: http://www.getlatlon.com/ as archived by the Wayback Machine (111 captures, 13 Oct 2007 to 15 Jan 2013), showing the Get Lat Lon page: "Find the latitude and longitude of a point on a map", a place name search box, the map crosshairs and "Built by Simon Willison".]


In 2009 I decided to write my own version of Get Lat Lon. Instead of using the
Google Maps API it would use all "open" software and data. The map data would be
from OpenStreetMap http://www.openstreetmap.org/ . The map tiles would be
from CloudMade using exciting new cartography from Stamen
Design http://mike.teczno.com/notes/cloudmade-styles.html . It would use the
modestmaps.js https://github.com/stamen/modestmaps-js library for managing
all those tiles. It would support the then still-nascent browser-based Geolocation
API http://www.w3.org/TR/geolocation-API/ to help determine your location.
The geocoding would be handled by
Flickr https://www.flickr.com/services/api/flickr.places.find.html and in
addition to geocoding it would also try to reverse geocode your location and display the
shape of the place http://code.flickr.net/2008/10/30/the-shape-of-alpha/
contained by a latlon, again using the Flickr
API https://www.flickr.com/services/api/flickr.places.findByLatLon.html .



[Screenshot: the original I Am Here after searching for "baghdad iraq", showing 33.31570000000001,44.392199999999995 and "Baghdad, Baghdad, Iraq (WOE ID 1979455)".]


And... it would be so clever and modular https://github.com/straup/js-iamheremap
that it would support multiple service providers and you could just drop it
in to any webpage and it would work, as if like magic. It was called "I Am Here" and I
think I was the only person to ever use it but it's still
running http://www.aaronland.info/iamhere/ . In 2014, though, CloudMade got
out of the tile business https://wiki.openstreetmap.org/wiki/CloudMade and so
there is literally not much to see anymore.





I am pretty sure that it's exactly one line of code to define a new map
provider https://github.com/straup/js-iamheremap/blob/master/iamheremap.src.js#L268-L280
to make I Am Here work
again but to be perfectly honest just looking at all that too-too clever code now, in 2016,
is exhausting. Also, see the way the licensing information on the map data hasn't been
updated to reflect the switch to the
ODbL http://www.openstreetmap.org/copyright ...

Reverse geocoding 

Fast forward to last year (2015) and work has begun in earnest on the Who's On
First (WOF) gazetteer https://whosonfirst.mapzen.com/ at Mapzen. Part of that
work has been to build hierarchies for each record in the gazetteer which is something of
a chicken-and-egg problem https://github.com/whosonfirst/whosonfirst-placetypes#here-is-a-pretty-picture .
We've been automating the process with a
general purpose point-in-polygon tool https://github.com/whosonfirst/go-whosonfirst-pip/
that we've written in-house using the Go programming language. It
is called go-whosonfirst-pip and it works like this:

• It loads an arbitrary number of WOF documents in to memory. Where
"WOF document" just means any GeoJSON document with a properties
dictionary containing id, name and placetype keys.

• The documents are indexed using an
R-tree https://en.wikipedia.org/wiki/R-tree . An R-tree is a data
structure optimized for storing multi-dimensional information, like
geometries.

• The document set is queried with a latitude and a longitude and optionally
filtered by placetype.

• Because the R-tree stores bounding boxes instead of complex geometries
(as Wikipedia says: The "R" in R-tree is for rectangle) there is a final
operation (called
raycasting https://en.wikipedia.org/wiki/Ray_casting ) to ensure
that a point is actually contained by any of the candidate results.


That's it. The purpose of the go-whosonfirst-pip code is to do fiddly math
across a large and heterogenous dataset as quickly as possible.
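
The two-stage lookup described above (a cheap bounding-box test followed by an exact raycasting test) can be sketched in a few lines of Python. This is only an illustration and not the go-whosonfirst-pip implementation: a linear scan over bounding boxes stands in for the R-tree, each place is reduced to a single ring of (longitude, latitude) pairs, and the function names and the shape of the `places` records are assumptions made for this example.

```python
def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y) for a ring of (x, y) points."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def in_bounding_box(point, bbox):
    """Cheap first pass: this is the only question an R-tree can answer."""
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def ray_cast(point, ring):
    """Exact second pass: count how many edges a ray heading in the +x
    direction from the point crosses. An odd count means inside."""
    x, y = point
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the ray's y value can be crossed.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def places_containing(point, places):
    """places is a list of dicts with hypothetical "name" and "ring" keys."""
    hits = []
    for place in places:
        ring = place["ring"]
        if in_bounding_box(point, bounding_box(ring)) and ray_cast(point, ring):
            hits.append(place["name"])
    return hits


# An L-shaped polygon: its bounding box covers (3, 3) but the shape does not.
L_SHAPE = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
PLACES = [{"name": "L-shaped place", "ring": L_SHAPE}]
```

The L-shaped polygon shows why the final step matters: the point (3, 3) passes the bounding-box test but fails the raycasting test, so it is correctly left out of the results.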




This is what an R-tree looks like, courtesy Wikimedia user
Chire https://commons.wikimedia.org/wiki/User:Chire


It only knows about points and the things that contain those points but it does not
know about context. For example, consider the following question: What continent is
Russia https://whosonfirst.mapzen.com/spelunker/id/85632685/ a part of?
Europe? Asia? All of the above? There are lots of interesting applications that remain to
be built on top of go-whosonfirst-pip but it is important to remember that it is not
an inference engine, by design.


The code includes a simple HTTP server (called wof-pip-server) that you
can use to easily load (and then query) one or more "meta"
files https://github.com/whosonfirst/whosonfirst-data/tree/master/meta
containing pointers to different WOF documents. If a WOF document is just a GeoJSON
with a few explicit properties then a "meta" file is just a CSV with a path column
containing a relative path to a WOF document.
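
As a rough sketch of what "a CSV with a path column" means in practice, the Python below reads a "meta" file and resolves each row to the location of a WOF document. Only the `path` column comes from the description above; the `id` column, the sample row and the `/data` root are made up for illustration, and real "meta" files may carry other columns.

```python
import csv
import io
import posixpath


def document_paths(meta_file, data_root):
    """Yield the full path of every WOF document listed in a "meta" file.

    meta_file is any open text file (or file-like object) containing CSV
    with a header row that includes a "path" column.
    """
    for row in csv.DictReader(meta_file):
        relative_path = row.get("path")
        if relative_path:
            yield posixpath.join(data_root, relative_path)


# A made-up two-column "meta" file, inlined so the example is self-contained.
SAMPLE_META = "id,path\n85633793,856/337/93/85633793.geojson\n"

paths = list(document_paths(io.StringIO(SAMPLE_META), "/data"))
```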


Although the "meta" files were originally conceived as little more than a simple 
helper tool (or index) for large volumes of data they have grown in to something of a 
"first class" object inside the world of Who's On First, with more and more of the tooling 
and infrastructure built around them. They are due for a longer more detailed discussion 
but not today. 


To get started with an instance of wof-pip-server that will query for
countries and neighbourhoods you would do:


$> ./bin/wof-pip-server -data /usr/local/mapzen/whosonfirst-data/data/ \
/usr/local/mapzen/whosonfirst-data/meta/wof-country-latest.csv \
/usr/local/mapzen/whosonfirst-data/meta/wof-neighbourhood-latest.csv
[placetype] country 219
[placetype] neighbourhood 49906


Depending on how fast your computer is the indexing process might take a
couple of minutes. By default the wof-pip-server listens for requests on port 8080
on your computer's local "loopback" network interface which is also called localhost,
so the URL for querying the server would be http://localhost:8080. For
example:


$> curl 'http://localhost:8080?latitude=40.677524&longitude=-73.987343'
[
    {
        "Id": 102061079,
        "Name": "Gowanus Heights",
        "Placetype": "neighbourhood"
    },
    {
        "Id": 85633793,
        "Name": "United States",
        "Placetype": "country"
    },
    {
        "Id": 85865587,
        "Name": "Gowanus",
        "Placetype": "neighbourhood"
    }
]


If you want to limit the result set to a specific placetype simply append 
placetype=PLACETYPE to your query string, like this: 



$> curl 'http://localhost:8080?latitude=40.677524&longitude=-73.987343&placetype=neighbourhood'
[
    {
        "Id": 102061079,
        "Name": "Gowanus Heights",
        "Placetype": "neighbourhood"
    },
    {
        "Id": 85865587,
        "Name": "Gowanus",
        "Placetype": "neighbourhood"
    }
]


Currently it is not possible https://github.com/whosonfirst/go-whosonfirst-pip/issues/22
to filter the result set with multiple placetypes. That's not
technically a bug but it's become clear that it's also not a feature.
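
Until the server can filter on more than one placetype, one workaround is to ask for everything and filter on the client. The Python sketch below builds the same query URLs as the curl examples and filters a decoded response by a set of placetypes. The helper names are invented for this example; the `latitude`, `longitude` and `placetype` parameters and the `Placetype` key are the ones shown above. Actually fetching the URL (for example with `urllib.request.urlopen`) is left out so that the sketch runs without a server.

```python
from urllib.parse import urlencode


def pip_url(latitude, longitude, placetype=None, endpoint="http://localhost:8080"):
    """Build a wof-pip-server query URL like the ones used with curl above."""
    params = {"latitude": latitude, "longitude": longitude}
    if placetype:
        params["placetype"] = placetype
    return endpoint + "?" + urlencode(params)


def filter_placetypes(results, placetypes):
    """Keep only the results whose Placetype is one of the wanted values."""
    wanted = set(placetypes)
    return [result for result in results if result["Placetype"] in wanted]


# The decoded response from the first curl example above.
RESULTS = [
    {"Id": 102061079, "Name": "Gowanus Heights", "Placetype": "neighbourhood"},
    {"Id": 85633793, "Name": "United States", "Placetype": "country"},
    {"Id": 85865587, "Name": "Gowanus", "Placetype": "neighbourhood"},
]

url = pip_url(40.677524, -73.987343)
both = filter_placetypes(RESULTS, ["country", "neighbourhood"])
```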


The wof-pip-server returns as little information as possible because it stores
as little information as possible, mostly for performance reasons. It is left up to
applications using wof-pip-server to decide whether and how to look up more
information about any given WOF document.


The nice thing about the go-whosonfirst-pip tools is that they are designed
to be as agnostic as possible about the data they index and serve. For example I recently
downloaded version 2.0.1 of the Flickr Alpha Shape
files https://archive.org/details/FlickrShapesPublicDataset2.0.1.tar and
re-jiggled the file structure (but not the actual data) of the alpha shapes and now they will
Just Work™ with wof-pip-server.


whosonfirst-www-iamhere 

So far, so good. We have an enormous bag of (Who's On First) 
data https://github.com/whosonfirst/whosonfirst-data/ and we have a tool 
for establishing the relationship(s) between those 

files https://github.com/whosonfirst/go-whosonfirst-pip/ but any volume of 
geographic data absent a map is... hard to see. 




[Screenshot: I Am Here reporting "the map is centered at 40.676277,-73.988371 at zoom level 13 which appears to be somewhere in Gowanus or Gowanus Heights".]


So, I rebuilt I Am Here (or Get Lat Lon) ... again. It's called
whosonfirst-www-iamhere https://github.com/whosonfirst/whosonfirst-www-iamhere or just I
Am Here (again) for short.


It does everything that the combination of Get Lat Lon and the original I Am 
Here did, but is built entirely using Mapzen tools and services. 


Aside from an ongoing need to simply know what the coordinates are for any
given spot on a map it seemed like whosonfirst-www-iamhere would be a good
and useful tool for visualizing and sanity-checking the results returned by the
go-whosonfirst-pip code.



[Screenshot: I Am Here reporting "the map is centered at 37.751376,-122.469149 at zoom level 14 which appears to be somewhere in Golden Gate Heights or Golden Gateheights".]

For example, Golden Gate... what?

This time, though, it's been built with two guiding principles in mind. The first is 
that "Mapzen should always be Consumer Zero (of Mapzen services)" and the second is 
to minimize the pain and nuisance of any one piece, of what is actually a pretty complex 
application, failing or shutting down or otherwise going offline. 

Mapzen as Consumer Zero 

The latest version of I Am Here uses a bunch of Mapzen services already: 

• It uses the Tangram https://github.com/tangrams/tangram
Javascript library for rendering tiles and
Refill https://github.com/tangrams/refill-style for styling them.

• It uses Mapzen Search https://mapzen.com/projects/search for
geocoding. This is not enabled by default as you'll need to sign up for an
API key https://mapzen.com/developers (it's really easy!) and add it
to the mapzen.whosonfirst.config.js config file.

• It uses Who's On First data https://whosonfirst.mapzen.com/data/
for reverse geocoding and for displaying geometries and other metadata
about a place.

In time, it will also use: 


• Mapzen Turn-by-Turn https://mapzen.com/projects/valhalla to
visualize the placetypes along a journey route. There is a branch of the
go-whosonfirst-pip code https://github.com/whosonfirst/go-whosonfirst-pip/tree/polyline
that will perform point-in-polygon tests
along an encoded
polyline https://developers.google.com/maps/documentation/utilities/polylinealgorithm
as returned by the Turn-by-Turn service so this
feature is mostly waiting on user-interface details rather than
number-crunching.

• A Who's On First enabled IP lookup
service https://whosonfirst.mapzen.com/mmdb/ to help determine
where to position the map when I Am Here is launched.


Small pieces, loosely failing 

The ultimate goal of whosonfirst-www-iamhere is to work from your own
computer in offline-mode (or when you don't have a network connection) without
needing to download and install a long list of dependencies. As of this writing:


• It does not come with pre-installed WOF data, but that's also a deliberate
choice. We'll talk more about that in a moment.

• It does not come with pre-installed or cached vector tiles (used by
Tangram). There is a margins-of-the-day project to enable tile-caching in
Tangram https://github.com/thisisaaronland/tangram/tree/localforage
but it is not ready for general usage yet.

• It does not come with a pre-installed version of Mapzen Search for offline
use. Since the code that powers the service (Pelias) is open source and
designed to be run by an individual that piece is left as an exercise to the
user.

• It does come with pre-installed platform specific (OS X, Linux and Windows)
binary applications for serving the Tangram Javascript code and the
wof-pip-server endpoint.

• It does come with a handy "startup" tool meant to take care of all of the details
when launching whosonfirst-www-iamhere. Currently the tool is
written in Python which comes pre-installed on OS X and Linux computers.
For people using Windows computers installing Python fails the
yet-another-dependency test so there is added impetus for rewriting it (the
startup tool) as something that can be pre-compiled to run on a Windows
machine, too.

• It does not come with a handy GUI startup tool. Currently it assumes a level
of familiarity with your computer's command-line (or terminal) interface.

Here is an example of how you might start whosonfirst-www-iamhere
from your computer. This assumes that you have downloaded (or cloned) the
whosonfirst-www-iamhere https://github.com/whosonfirst/whosonfirst-www-iamhere
code and have navigated in to the root directory.

$> ./bin/start.py -d /path/to/your/whosonfirst-data/data \
/path/to/your/whosonfirst-data/meta/wof-neighbourhood-latest.csv \
/path/to/your/whosonfirst-data/meta/wof-locality-latest.csv




[Screenshot: I Am Here after searching for "Baghdad", reporting "the map is centered at 33.332823,44.426651 at zoom level 11 which appears to be somewhere in Baghdad".]


The short version is that once the start.py script has finished setting
everything up you can open your web browser up at http://localhost:8001 and
start poking around countries and neighbourhoods from Who's On First.


The longer version follows. By default the start.py tool requires a minimum
of two arguments. The first (-d) is the path to where, on your computer, you've stored
your raw Who's On First data files. The second and third arguments are the paths to
"meta" files (remember them?) that the wof-pip-server will index. The start.py
tool will start three separate servers running on your computer:


• A simple web server that will serve the Who's On First data you specified (with the
-d parameter) running on port 9999. Remember the way we said that
wof-pip-server returns little more than a WOF ID? The reason for this
data-only server is that the whosonfirst-www-iamhere application
will "inflate" a WOF ID, returned by the PIP server, by fetching its record
from the data server.

• Another simple web server that will host the whosonfirst-www-iamhere
application running on port 8001. These two pieces could be
served by a single more sophisticated web server.

• Finally the wof-pip-server itself running on port 8080.
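
To make the "inflate" step more concrete, here is a small Python sketch of how a client might turn the IDs returned by the PIP server in to URLs on the data server. It rests on an assumption that is not spelled out in this post: that records are stored in nested folders named after the ID split in to groups of three digits, with the file named after the full ID. The function names are invented for this example, and the `fetch` argument is a stand-in for whatever HTTP client you prefer, so no running servers are needed to try it.

```python
def id_to_relative_path(wof_id):
    """Map a WOF ID to a relative path, for example
    85633793 -> 856/337/93/85633793.geojson (assumed layout)."""
    digits = str(wof_id)
    folders = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    return "/".join(folders + [digits + ".geojson"])


def data_url(wof_id, endpoint="http://localhost:9999"):
    """URL of a record on the data-only server (port 9999 by default)."""
    return endpoint + "/" + id_to_relative_path(wof_id)


def inflate(pip_results, fetch):
    """Replace each minimal PIP result with its full record.

    fetch is any callable that takes a URL and returns the decoded
    GeoJSON document found there.
    """
    return [fetch(data_url(result["Id"])) for result in pip_results]


# A fake fetch that just echoes the URL, in place of a real HTTP request.
records = inflate([{"Id": 85633793}], lambda url: {"fetched_from": url})
```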

All of these port numbers can be changed if necessary. To do so you would pass
your own settings as parameters to the start.py tool and as custom settings in the
mapzen.whosonfirst.config.js config file.



[Screenshot: I Am Here reporting "the map is centered at 42.358430,-71.059770 at zoom level 11 which appears to be somewhere in Boston".]


Bundles 

One of the challenges with Who's On First has been balancing our desire for a
robust and portable data format (plain text GeoJSON files), the needs for an historical
audit trail and the mechanics of working with and distributing a large and ever-growing
dataset. We have been using Git and GitHub extensively for much of the work to date but
as the commit history around the data grows so too does the size of the
whosonfirst-data repository and the burden in using it or simply getting started
with Who's On First.


[Screenshot: I Am Here after searching for "Montreal", reporting "the map is centered at 45.398450,-73.476562 at zoom level 3 which appears to be somewhere in Quebec". Map attribution: Leaflet | Tangram | © OSM contributors | Who's On First | Mapzen.]


As an alternative we have been working on something called "bundles". Bundles
are:


...a collection of GeoJSON formatted files (Who's On First data) grouped by 
a specific property, like placetype. They allow for people to more easily bulk 
download a subset of the entire Who's On First dataset. Currently there are 
only bundles by placetype but eventually we will add a variety of different 
"slices" of the data as demand and interest require. 


Each bundle contains a "meta" file (see... they just keep popping up all over the
place!) and a folder named data which contains the files listed in the meta file. Bundles
do not contain any Git history or related metadata but our hunch is that many people
don't need or want that information. The startup tool mentioned above does not yet have
support for bundles but that will happen shortly. In the meantime you can get started with
whosonfirst-www-iamhere and bundles with a few short commands in your
terminal.


For example, if you just wanted to run a copy of whosonfirst-www-iamhere
using only
microhoods https://whosonfirst.mapzen.com/spelunker/placetypes/microhood/
(which are currently all in San
Francisco https://whosonfirst.mapzen.com/spelunker/id/85922583/descendants/?&placetype=microhood )
you would do the following:


$> cd /path/to/your/whosonfirst-www-iamhere
$> curl -O https://whosonfirst.mapzen.com/bundles/wof-microhood-latest-bundle.tar.bz2
$> tar -xvjf wof-microhood-latest-bundle.tar.bz2
$> ./bin/start.py -d wof-microhood-latest-bundle/data wof-microhood-latest-bundle/wof-microhood-latest.csv



[Screenshot: I Am Here reporting "the map is centered at 37.796356,-122.396107 at zoom level 14 which appears to be somewhere in Super Bowl City".]


All of the details and currently available bundles are listed over at
https://whosonfirst.mapzen.com/bundles
and... yes, Super Bowl
City https://whosonfirst.mapzen.com/spelunker/id/420561633/ is a thing that
really happened in 2016.


Or you can just use our version 

As mentioned at the beginning of this blog post there is a publicly accessible
version of I Am Here for you to play with at
https://whosonfirst.mapzen.com/iamhere/ .


Right now it only displays neighbourhoods but shortly we will add the ability to
select different (even multiple) placetypes to display at the same
time https://github.com/whosonfirst/go-whosonfirst-pip/issues/22 . And as
circumstance permits we will add the additional features (routing and IP lookups)
mentioned above. And then all the stuff we haven't even thought of yet.


Enjoy! 


2016-02-19 
the pendulum of bespokiness


Aaron Straup Cope / Bosch Connected Experience March 2016 


Last week, I had the pleasure of speaking at the Bosch
Connected Experience https://www.bosch-si.com/lp/experience/bcx-agenda/agenda.html
event, in Berlin. I spoke about the
Pen https://collection.cooperhewitt.org/pen , just in case
you thought there wasn't anything left to say about it.


This is what I said. 





Hello, my name is Aaron. These days I make
maps https://whosonfirst.mapzen.com but between 2012 and
2015 I was Head of Engineering and part of the Digital and
Emerging Media https://labs.cooperhewitt.org/ team at
the Cooper Hewitt Smithsonian Design
Museum https://cooperhewitt.org/ , in New York City.

Our job was to help re-imagine what it means for a museum,
in the twenty-teens, to be a full citizen on and of the internet and vice
versa. Some of that work is what I'll talk to you about today.


The title of this talk — The Pendulum of Bespokiness — is 
my reaction to the idea, the Zeitgeist, that all devices and all their 
disparate functionalities are converging. Typically towards people's 
phones and more generally some kind of "hub".


I am not going to claim that it isn't happening but I would 
like to suggest that — however subtly or however gently — this 
trend has begun to swing the other way. This is the lens that I would
like to use to look at the work we did at the Cooper Hewitt. 



http://shlong.us/g2 

20K words of (ongoing) theory and praxis 


There is no way that I will be able to convey the complexity 
and the detail of all the work we did in a single talk. In addition to all 
the work that we on the Digital team did the museum was also closed 
for an expansion and an historic restoration and had plans to re-open 
with upwards of ten new exhibitions, simultaneously. 


By my estimate something on the order of twenty to
twenty-five thousand words have been written by and about the work we did
at the museum, notably the Pen which I will come back to shortly. I
have compiled an appendix with links to the relevant
writings http://aaronland.info/talks/2016/bcx16/appendix.html
and it can be found at this URL.

I will show this link again a few more times during the talk. 



I'd like to start with a pretty benign diagram, one that borders 
on the banal. The intersection of these three forces is the territory 
where most human endeavour happens. 

The near-future is the place that we are all, in our separate
projects, working towards. The details may change but the ambition
remains the same. It is the horizon of the almost-reachable and
nearly-there possibilities that acquire a kind of hyper-clarity born of
past experience (usually failure). The present is the uneasy terrain
between those two poles.

The present is also the place where the boundaries between
one's own past and near-future are reimagined, sometimes redefined
beyond recognition, by other people.



[Slide: a diagram labelled delusion, trauma and despair.]


So this often is what that diagram looks like, in practice. 




Here's a more concrete example. 


In 2016 we more or less have a handle on building large,
networked software systems. It's not perfect but neither the
heavy-lifting nor the economics of these kinds of projects lend themselves to
immediate dread or panic anymore.


In 2016 we do not enjoy an equivalent facility with hardware,
specifically electronics, but the costs are shrinking rapidly. Every
couple of years seems to knock a zero off the total cost of hardware
development so that's progress.


In 2016 industrial design, tooling, manufacturing and the
complex interplay and demands of material properties (a process
sometimes referred to as "design for manufacture" or DFM but what
Bryan Boyer once aptly described as "matter
battle" http://etc.ofthiswearesure.com/2011/01/matter_battle/ )
remains the place where everyone still goes to be sad.



In fairness to the experiences of the people in this room that
last slide could be reconfigured to look like this but even here DFM 
is the meat-grinder of reality where all projects go to meet an 
uncertain fate. 


This was certainly our experience at the museum, anyway. 





Because we made a digital pen, from scratch. 


We made a pen that is actually an NFC-enabled capacitive 
stylus. It allows people to collect objects on display in the museum 
by tapping the pen to an object label, which itself contains an 
embedded NFC tag, and recording that object's unique ID. 


The NFC antenna is activated when a button housed in the
rear of the pen is pressed. The pen contains a circuit board with a
small amount of on-board storage and an equally small
microcontroller for recording the things the antenna sees.


The pen's only feedback mechanisms are a gentle vibrator 
and three small monochromatic lights. It is designed for 21 to 28 
consecutive days of use on the floor before its three AAA batteries 
need to be replaced. 




Every visitor to the museum is given a pen that is then
paired with their ticket which contains a unique short code (or URL).
That code can be used to retrieve its corresponding visit online. A
person might look at their visit's online doppelganger while they are
still in the museum galleries or from home, six months later. Or
both http://labs.cooperhewitt.org/2016/a-very-happy-open-birthday-for-the-pen/ .





Museums have the luxury of being patient that way, which is 
kind of a big deal. 




The Pen (no longer simply just a "pen") was originally 
conceived of as a device — literal and metaphorical — to re-imagine 
the museum visit as something other than the usual experience where 
people march from one object to the next passively receiving the 
wisdom of the experts. 


The Pen is meant to give visitors permission to play in the
galleries. To give them permission to touch and in some cases to
write on or affect maybe not the actual objects on display but
certainly their digital representations. Throughout the museum there
are a number of large multi-user interactive tables, each with
custom-written applications, where you can browse more
information about the objects on display as well as related objects
from the collection. That's pretty important for a museum like the
Cooper Hewitt which has two hundred thousand objects (or the
larger Smithsonian which has 137
million http://dashboard.si.edu/ objects) but only enough
physical space to display about 700 of them at a time.


These applications also allow visitors to create, and save,
their own designs. This is a photograph of the Immersion
Room http://www.newyorker.com/culture/culture-desk/new-cooper-hewitt ,
probably the most popular part of the museum
since the re-opening. The Immersion Room consists of an interactive
table and a floor-to-ceiling projection where visitors can see the
museum's vast collection of
wall-coverings https://collection.cooperhewitt.org/departments/35347503/
in a way that had never been possible before. They
can also create their own wallpaper designs, drawing on the tables
with the Pen or their fingers, and see them projected in real-time and
at scale and then, as I've mentioned, save them to their online visit.
As much as the Pen was designed as a way to foster activity
in the galleries — and dwell times have nearly doubled, sometimes 
tripled, since the re-opening — it was also explicitly designed as a 
way to extend a person's visit beyond the museum. So far it's worked 
pretty well. 


The Pen, which was launched a year ago tomorrow, has been
used by visitors to collect objects four million times. It enjoys
conversion rates that would make Silicon Valley companies weep. I
don't have the current numbers but when I was still at the museum
we saw conversion rates of 20 and 30 percent, respectively, for
people looking at their online visit within the first two weeks and
then creating a Cooper Hewitt account.


The other important number is that visitors have collected 
something in the neighbourhood of 5,000 unique objects. Remember
how I mentioned that the museum only has space to display 700 
objects at a time in the galleries? The goal of anyone who says 
"museum" and "digital" in the same breath has always been to find 
meaningful bridges between the analog and digital and the Pen 
appears to be doing something right in that regard. 



http://shlong.us/g2 

a catalog of war stories 


Naturally when we started with the Pen concept in 2012 we 
assumed that a device such as we were imagining must already exist. 


Naturally it did not, or rather if it did exist it existed to 
operate only under conditions and circumstances entirely unsuited 
for the realities of a busy museum with a heterogenous audience. 


Naturally we found ourselves doing full-on product design,
with a lot of help from friends and partners. Product design is not
only outside the comfort zone of museums, it is also outside our
experience.


For a design museum though it was a good and healthy albeit 
challenging experience. For a design museum that celebrates process 
there is value in reminding oneself that things don't just magically 
transition from napkin sketch to finished product. 


What follows is a list of the people involved in the project 
and a few of the "matter battle" highlights, starting with the finished 
Pen and working our way down the stack. 




MakeSimply http://makesimp.ly/ is a manufacturing
company and broker based out of Taiwan and the US. They did all
the mechanical engineering, tooling, assembly and herding of cats
involved in any project that requires twelve separate factories.


The most complicated parts of the Pen are at the ends, but the
center is pretty straightforward. It's a tube in to which you slide a
circuit-board and three batteries. The capacitive sheathing is applied
using a process called overmolding where one material (the
capacitive sheathing) is heated to the point of being viscous and then
sprayed on to another material (the body of the pen) and left to
harden in place.


So far so good, but then we got a call from MakeSimply one 
day. They told us that the overmolding was melting the then-plastic 
body of the Pen. The only solution was to use an aluminum body 
which caused the electronics people to have a little freak-out. More 
about that in a minute. 



[Slide: MakeSimply, General Electric]


General Electric https://twitter.com/gedesign (GE) 
did all the industrial design for the Pen, including early prototype 
designs and all the finished UX work for the lights and haptic 
feedback on the Pen. 


GE were great and they were very clear and upfront with us
early on that they would not do DFM. It's not like GE can't do DFM
but they are set up to do DFM for things like submarines and wind
turbines, not small, bespoke projects like ours. That was a learning
experience for the museum.





Sistel Networks http://sistelnetworks.com/ did all of
the electronics and NFC-wrangling for the Pen and the antenna
boards for shuttling information on and off the Pen.


They came up with the initial design concept for the button-activated
radio which was instrumental in allowing us to have a device
that can go uncharged for three to four weeks. The Cooper Hewitt
has, even by museum standards, a tiny staff. There are only 70 to 75
people across the entire institution so changing batteries, or simply
recharging devices, every day was a non-starter.

To put the work that Sistel did to achieve this in perspective
the circuit boards in the Pen don't even have a clock since that would
have added a prohibitive drain on power.





Ideum http://ideum.com/ designed and built all of the
interactive tables, which came in three flavours: 32-inch tables for
single-person use, 55-inch tables for small groups and 84-inch tables for
larger groups and as a way for other people to learn how the tables
worked simply by watching other people use them. Each one of the
tables has one to eight built-in NFC antennae for communicating
with the Pens.




Ideum builds things to withstand the continued abuse of 
small and young children in addition to absent-minded adults, in a 
museum setting, but sometimes you still have to create the future 
you want to use. 

Which is to say that in the summer of 2014 the 84-inch tables 
were still a work-in-progress. An unfinished piece of hardware that 
the software developers had to accommodate and adjust to on a 
weekly basis. A thing that needed to be assembled before all the 
NFC hardware had been finalized. A thing that couldn't be moved 
out of the museum once it was determined that adjustments needed 
to be made because we had already started installing objects from the 
collection in the galleries. 






Diller Scofidio + Renfro http://www.dsrny.com/ 
designed all the case-work in the museum. They are beautiful cases 
but one of the challenges they presented the museum with is that 
they are made of solid metal. Metal is, of course, the perfect enemy 
of all radio signals including NFC. 


I'd like to take a moment to thank the entire country of 
Germany because apparently you are the only people who produce 
ferrite shielding, a material that can be used to dampen the 
interference between metal and radio signals. If you noticed a 
shortage of ferrite shielding in late 2014, or early 2015, that was 
probably the museum buying as much of the stuff as it could get its 
hands on. 






Tellart http://aaronland.info/talks/2016/bcx16/appendix.html 
designed the electronics for the Pen registration 
stations, used by the visitor services (or "front of house") staff for 
pairing a Pen with a visitor's ticket. 

There is a whole other talk about just this 
piece http://www.aaronland.info/weblog/2015/12/31/belief/#design-eagle , 
so I will simply say that the requirement for the 
registration stations is to allow the front of house staff to complete a 
ticket purchase and pair a pen for visitors, many of whom don't know 
anything about the Pen or who then want to have a conversation 
about it, in under 30 seconds. 




Local Projects http://localprojects.net/ designed 
and wrote all of the software for the interactive tables. 


There aren't any specific matter battle stories for Local 
Projects. They were more about the software side of things than the 
actual hardware. That also means they were on the receiving end, 
with fixed-date deliverables and their reputation on the line, of 
everyone else's hardware challenges. If product design is outside a 
museum's comfort zone, the way the Pen came together and the way it 
forced everyone to work together is equally outside the comfort zone 
of many client-services design studios. 





Finally, the Cooper Hewitt. In 2012, when the Digital and 
Emerging Media team was created, we took all museum related 
software development 
in-house http://www.aaronland.info/weblog/2012/11/09/jello/#parallel-tms . 
The net-result of this process was to say to every 
third-party who wanted to do anything with Cooper Hewitt related 
data that they would do so using the museum's own API. 




We, not yet another custom black-box database, would be the 
single source of truth for all things museum-related. The building, by 
way of the interactive tables and the Pen, is now the single largest 
consumer of the museum's own 
API https://collection.cooperhewitt.org/api . 






Moving away from the hardware side of things, 
Undercurrent http://www.undercurrent.com/ , the now sadly-defunct 
strategy firm in New York City, lent us Jordan 
Husney https://twitter.com/jrhusney to help oversee and 
manage all the moving pieces in getting the Pen out the door. 







The Cooper Hewitt was its own general contractor, managing 
and buffering and sometimes cajoling all the pieces in the same 
direction. Not shown here are the even more complex layers of pain 
required to do budgeting and contracts inside the federal 
government. The Cooper Hewitt is part of the Smithsonian which is 
a "trust instrumentality" of the US government so it is bound and 
subject to everything you've ever heard about government 
contracting and purchasing agreements. That was fun. 





And this is how it all fit together. There is also a full-sized 4K 
version of this 
diagram http://aaronland.info/weblog/2016/03/09/osha/images/bcx16-pen-all.png 
if you're curious. 


It is important to understand that most of the actual work you 
see described here happened in one very compressed, very manic 
and very difficult twelve-month period between February 2014 and 
February 2015. Planning and design meetings, as well as the digital 
infrastructure work that the Cooper Hewitt itself was responsible for, 
had been underway since 2012 but the Pen proper was born in a 
crucible of bad craziness where each partner was forced to adapt and 
adjust to the demands and changes imposed by every other partner, 
all at the same time. 

Almost no one involved with the Pen was happy for most of 
that year. 




The bad news is that no one should ever try to do things the 
way we did. We did things this way out of necessity but ours should 
not be the template for how to do product design. 

On the other hand, the good news is that it is totally possible 
to imagine, design and manufacture something like the Pen inside of 
a year which is pretty exciting. It's still hard and still not cheap (but 
actually a lot less expensive than you might imagine) but it is 
possible. 




So, why did we do it in the first place? 






We did it because of this. This is the slide that set the whole 
project in motion, back in mid-2012. The museum had given its 
exhibition designers license to imagine what an interactive museum 
experience might look like beyond simply another kiosk or mobile 
"app". 


This is what they came back with. They came back with this 
idea of giving every visitor an interactive pen as a way to invite them 
(and to challenge the museum) to think of the museum as a 
participatory 
experience http://www.participatorymuseum.org/ . 


The Cooper Hewitt is a design museum and, for a museum 
anyway, design is as much about the process as it is about the 
finished object so the idea of an object (the Pen) that embodied this 
principle was fascinating. 




But as intrigued as we were by the possibilities of a proactive 
device it was the other affordances that the Pen suggested which 
really spurred us on. 


What if it were possible for someone to visit the museum and 
have a record of that experience without, and this is important, 
having to spend all of their time actively engaged with technology in 
order to do that? 




What if a person could simply reach out and touch an object, 
or its label, to collect it? It's pretty easy to imagine that possibility 
but it turns out the Pen is still, in 2016, the minimum required 
hardware in order to actually do that outside of burning away the 
details with an irresponsible amount of money. 


Along the way many people asked us why we didn't just 
create a mobile app to record a person's visit. There are many 
reasons but even if the only reason had been cost (the cost of 
developing for multiple platforms and then the ongoing maintenance 
costs) we still would have said "No". 


We would have said no because we wanted to ensure, to 
preserve, what we spoke of internally as a "heads up" visit. The point 
of visiting the museum is to allow people an opportunity to focus on 
the collection, not the technology. 


If the technology only serves to relegate the collection to a 
supporting-cast role then perhaps there is a larger problem with the 
entire museum project itself. 




The Pen does as little as possible because you are not meant 
to be dazzled by the technology, at least not the second time you visit 
the museum. 

You are meant to be able to take the Pen for granted because, 
ultimately, you are meant to be able to take for granted what the Pen 
makes possible: The luxury and the confidence of recall (your 
visit) at the time of your own 
choosing http://www.aaronland.info/weblog/2014/09/11/brand/#dconstruct-016 . 

The Pen is not a device meant to surprise you. We leave that 
to the collection https://collection.cooperhewitt.org/ . 



We live in a world with enough technological surprise as it is. 

In 2016 the cognitive overhead of not simply understanding 
what an object does but what an object might do, particularly when it 
is connected to the internet, is overwhelming on good days. On bad 
days it can feel like a betrayal. By way of example let's just pick a 
few recent headlines from the news: 




• The Nest thermostat and its failed software update over 
the holidays that left a number of people without 
heating, in the middle of winter. Bugs happen and even 
hardware fails. I once spent nine hours without power 
in minus 30 degree weather when a single transformer 
in Northern Quebec exploded one night and plunged 
three quarters of the province in to darkness. What I 
find striking about the way people talk about the Nest 
incident is that they are starting to 
question http://www.nytimes.com/2016/01/14/fashion/nest-thermostat-glitch-battery-dies-software-freeze.html 
what the point is, what the benefits are, of making 
something as simple and 
useful https://collection.cooperhewitt.org/types/35294933/ 
as a thermostat "smarter". 

• Meanwhile, the head of the intelligence services in the 
United States has said in no uncertain terms that they 
will use the "internet of things" as a platform for 
spying on 
people http://www.theguardian.com/technology/2016/feb/09/internet-of-things-smart-home-devices-government-surveillance-james-clapper . 
Everything you, here in this room, are 
working on will be repurposed, wholesale, as 
surveillance 
pipes http://idlewords.com/talks/our_comrade_the_electron.htm . 

• Finally, and I apologize in advance if this is a trigger 
word for some people in the room, there is Volkswagen's 
so-called "defeat device" 
software https://en.wikipedia.org/wiki/Defeat_device 
that is deliberately programmed to lie about a 
car's fuel emissions. If the other examples are the 
domain of nerds and people wearing tin-foil hats then 
the VW fiasco is the place where everyone else is 
looking at the evidence and starting to ask just what the 
fuck is going on? It is difficult not to see the "defeat 
device" as a slap in the face to the entire notion of a 
social 
contract https://en.wikipedia.org/wiki/Social_contract . 


Trying to find your bearings, trying to make sense out of all 
these crosscurrents is dizzying and I don't have a pat answer for any 
of it. I would like to finish, though, by arguing that these ambiguities 
are what constitute the "present" for the internet of things. 


The lack of clarity around, and in some cases the hijacking 
of, the reasonable expectations of action and reaction in an internet 
of things is what is and what will shape and redefine the relationship 
between the past and near-futures of the internet of things you have 
imagined for so long. 

This is the work. 



http://shlong.us/g2 

http://aaronland.info/talks/2016/bcx16/appendix.html 


Thank you. 




2016-03-09 





things I have written elsewhere 


#20160408 


Yes No Fix 





This post was originally published on the Mapzen 
weblog https://mapzen.com/blog/yes-no-fix/ , in April 2016 . 

tl;dr 


Opinions and fact-checking. About stuff. As CSV documents. From any webpage. Or at 
least Who’s On First Spelunker https://whosonfirst.mapzen.com/spelunker/ webpages. 
With code https://github.com/whosonfirst/js-mapzen-whosonfirst-yesnofix . 


A short history 

In the list of Really Good Things on the Internet These Days I still think the work done by 
the Government Digital Service https://gds.blog.gov.uk/ in the United Kingdom to 
rebuild the gov.uk https://gov.uk website is at, or near, the top. A close second would be the 
New York Public Library's (NYPL) Building 
Inspector http://buildinginspector.nypl.org/ project. 

The Building Inspector project began after the NYPL developed a suite of computer 
vision tools for extracting the building footprints https://github.com/nypl-spacetime/map-vectorizer 
from their extensive maps collection. The results were remarkably 
good for an automated process but still not perfect. Occasionally the software would get things 
wrong but sometimes it would return results that were maybe not wrong but not entirely correct 
either. To deal with these inconsistencies the NYPL created the Building Inspector 
website http://buildinginspector.nypl.org/footprint/ and asked the public to help out 
by asserting whether a footprint computed by their software was correct (yes), incorrect (no) or in 
need of some fixing (fix). 


It looks like this: 




The project https://github.com/nypl-spacetime/building-inspector has since 
evolved to allow contributors to vet more than just building footprints and the NYPL recently 
announced that people have contributed more than 1.5 million 

tasks https://twitter.com/nypl_labs/status/716999124319670272 since launching. That 
original concept to atomize the problem (individual building footprints) and then ask the public 
for simple observations (yes, no, fix) remains a stroke of genius. So we thought we'd try copying 
it. 

How does it work 

Yes No Fix is a single Javascript library with a pair of public API methods. Each method 
takes an arbitrary data structure as its input and renders it as a nested HTML table where each 
value (at the end of a nesting) has interactive controls to allow a viewer to assert an opinion (yes, 
no or fix) about that value. The second method will also append the rendered table to an existing 
DOM element in the webpage it was called from. 


Here's an example using the names section for the city of San 
Francisco https://whosonfirst.mapzen.com/spelunker/id/85922583/ from the Who's On 
First Spelunker https://whosonfirst.mapzen.com/spelunker/ . The raw data looks like 
this: 

"properties": { 
    "name:chi_x_preferred": [ 
        "\u65e7\u91d1\u5c71" 
    ], 
    "name:chi_x_variant": [ 
        "\u820a\u91d1\u5c71" 
    ], 
    "name:eng_x_colloquial": [ 
        "City by the Bay", 
        "City of the Golden Gate", 
        "Fog City", 
        "Fog Cty", 
        "Frisco", 
        "Golden City", 
        "S Fran", 
        "S. Fran", 
        "San Fran", 
        "The City", 
        "S.F.", 
        "Bay Area", 
        "S.F. Bay Area", 
        "The City by the Bay", 
        "Baghdad by the Bay", 
        "The Paris of the West", 
        "Ess Eff", 
        "SFC", 
        "San Francisco City" 
    ], 
    // and so on 


And the rendered version looks like this: 



[Screenshot: the San Francisco names rendered as a nested table under "name", with rows for 
chi_x_preferred, chi_x_variant, eng_x_colloquial, eng_x_preferred and eng_x_unknown.] 


When you mouse over a value - in this case the English colloquial name of The City - 
you'll see an edit control. 


[Screenshot: the same table of names, with an edit control next to The City.] 


If you click on it then a series of controls (yes, no and fix) will appear next to that value. 


Like this: 


Yes 


The first is yes which means that you agree with the value (and its parent nesting). 


[Screenshot: the table of names with the yes, no, fix and cancel controls next to The City.] 


No 


The second is no. No means no. https://en.wikipedia.org/wiki/Nomeansno San 
Francisco is not called Frisco. 


[Screenshot: the table of names with the yes, no and fix controls next to Frisco.] 


[Ed. Some of us think Frisco is 
https://www.thrillist.com/entertainment/san-francisco/san-francisco-nickname-frisco-sf actually 
OK http://www.buzzfeed.com/burritojustice/frisco-24wct . Which kind of proves the 
point of Yes No Fix. San Fran on the other hand...] 


[Screenshot: the table of names with the yes, no, fix and cancel controls next to San Fran.] 


Fix 


The third value is fix which means, broadly, "this is weird data". If that seems a little vague 
and ambiguous that is because it's meant to be. "Fix" is a shorthand for things that are sort of 
correct but still incorrect, or vice versa. Life is complicated that way. 


[Screenshot: the table of names with the yes, no, fix and cancel controls next to Ess Eff.] 


Locked 

Sometimes a value might be "locked" or "excluded" which means that it is not possible to 
make a yesnofix style assertion about it. The reasons why something might be excluded are 
defined by individual applications. We'll explain how that's done, below. In this example the 
edtf:inception and edtf:cessation dates are locked because they already have a 
default value of "unknown" so there's not a lot of use in collecting opinions about them. 


[Screenshot: the Properties panel, with its "view raw" and "show report" buttons, showing the 
edtf values (cessation and inception, both uuuu) as locked, followed by the geom values 
(area, bbox, latitude and longitude).] 


Reports 

In the screenshot above there is a show report button. When clicked it will display 
three more elements: A comma-separated value (CSV) rendering of all the assertions that have 
been made so far and another button for submitting the report (and a button to hide everything). 


[Screenshot: the Properties panel with the "hide report" and "submit report" buttons and the CSV 
text of the report displayed above the edtf values.] 


By default, Yes No Fix will "submit" the report to a new browser window because that's all 
it knows how to do. Here's a plain-text version of the report shown in the screenshot above: 



path,value,assertion,date 
name.eng_x_colloquial#10,S.F.,1,2016-04-02T17:55:36.335Z 
name.eng_x_colloquial#9,The City,1,2016-04-02T17:55:50.767Z 
name.eng_x_colloquial#4,Frisco,0,2016-04-02T17:56:17.137Z 
name.eng_x_colloquial#16,Ess Eff,-1,2016-04-02T17:57:38.917Z 

Reports are formatted as CSV documents, with four columns: 

1. path represents the nested keys from your data structure (in this case the 
properties dictionary from the underlying GeoJSON file for San Francisco) 
collapsed in to a string using a . notation as a delimiter. 

2. value is the raw value that someone is commenting on. 

3. assertion is a (signed) integer; 1 means yes, 0 means no and -1 means fix. 

4. date is an ISO-8601 date string indicating when the assertion was made. 
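
As a rough sketch of how an application might read one of these reports back in, the function 
below splits each row in to its four columns and translates the assertion integer in to a label. 
The names parse_report and ASSERTION_LABELS are made up for this example and are not part 
of yesnofix.js. It also assumes that the path and date columns never contain commas, so that a 
value which does contain one can be stitched back together. 

```javascript
// Hypothetical helper, not part of yesnofix.js
var ASSERTION_LABELS = { "1": "yes", "0": "no", "-1": "fix" };

var parse_report = function(report){

    var rows = [];
    var lines = report.split("\n");

    // start at 1 to skip the header line (path,value,assertion,date)
    for (var i = 1; i < lines.length; i++){

        var ln = lines[i].trim();

        if (ln == ""){
            continue;
        }

        var cols = ln.split(",");

        if (cols.length < 4){
            continue;
        }

        // Assumption: only the value column can contain commas so take the
        // path from the front and the date and assertion from the back
        var date = cols.pop().trim();
        var assertion = cols.pop().trim();
        var path = cols.shift().trim();
        var value = cols.join(",").trim();

        rows.push({
            'path': path,
            'value': value,
            'assertion': parseInt(assertion, 10),
            'label': ASSERTION_LABELS[assertion] || "unknown",
            'date': date
        });
    }

    return rows;
};
```

Each element of the returned list is a plain dictionary so it can be filtered or grouped (all the 
"no" assertions, say) with ordinary array methods. 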

A couple things to note about paths: 

• As of this writing there are still some explicit Who's On First -isms left in the Yes No 
Fix code. Specifically the expectation that keys have a colon-separated prefix (for 
example name:eng_x_colloquial) that is parsed and used to group things in 
to buckets. Keys that don't have a prefix are automatically grouped in to a bucket called 
_global_, so if you had a key simply called date it would be encoded as 
_global_.date in the final CSV report. 

• Array values in a path are denoted using a #<OFFSET> syntax, where offsets count from 
zero. For example name.eng_x_colloquial#16 is the element at offset 16 in the 
properties['name:eng_x_colloquial'] array. 
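
To make the path rules concrete, here is a small sketch of how the paths for a properties 
dictionary could be derived. It is an illustration of the rules described above rather than the 
code that yesnofix.js actually uses, and the function name flatten_paths is invented for the 
example. 

```javascript
// Hypothetical sketch, not the yesnofix.js implementation
var flatten_paths = function(props){

    var paths = {};

    var walk = function(value, path){

        if (Array.isArray(value)){

            // array values use the #<OFFSET> syntax, counting from zero
            for (var i = 0; i < value.length; i++){
                walk(value[i], path + "#" + i);
            }
        }

        else if ((value !== null) && (typeof(value) == "object")){

            // nested dictionaries extend the path using . as a delimiter
            for (var k in value){
                if (Object.prototype.hasOwnProperty.call(value, k)){
                    walk(value[k], path + "." + k);
                }
            }
        }

        else {
            paths[path] = value;
        }
    };

    for (var key in props){

        if (! Object.prototype.hasOwnProperty.call(props, key)){
            continue;
        }

        // keys with a colon-separated prefix are grouped by that prefix and
        // everything else goes in to the _global_ bucket
        var idx = key.indexOf(":");
        var bucket = (idx == -1) ? "_global_" : key.substring(0, idx);
        var name = (idx == -1) ? key : key.substring(idx + 1);

        walk(props[key], bucket + "." + name);
    }

    return paths;
};
```

Given the San Francisco names shown earlier this would map name.eng_x_colloquial#4 to Frisco, 
which matches the path in the sample report. 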


How to use yesnofix.js 

First grab a copy of the code from the js-mapzen-whosonfirst-yesnofix GitHub 
repository https://github.com/whosonfirst/js-mapzen-whosonfirst-yesnofix . Then 
add it to your webpages, like this: 


<link rel="stylesheet" type="text/css" href="mapzen.whosonfirst.yesnofix.css" /> 

<script type="text/javascript" src="mapzen.whosonfirst.yesnofix.js"></script> 



The simplest way to use yesnofix.js is to call the apply method with a data structure and a 
target HTML element. This will generate a pretty HTML table complete with Yes No Fix 
style controls for each value and insert it in to the DOM as a child of the target HTML element 
you've defined. 


mapzen.whosonfirst.yesnofix.apply(data, target_el); 

If you just want to render a data structure but delay or defer adding it to the DOM you can 
call the render method. 


var pretty = mapzen.whosonfirst.yesnofix.render(data); 

That's it. By default every element in your data structure will be made Yes No Fix -able. 

Customizing things 

Warning: This is the part where things start to get a bit nerdy. Where "a bit nerdy" really 
means VERY VERY NERDY. If you're not in to the nerdy bits you should have enough 
information to get started and can just jump to the bottom of the post #spelunker . 


One of the things that quickly became apparent when integrating yesnofix.js with the 
Who's On First spelunker is that many things needed to be customized. The whole point of a 
spelunker is to be able to jump around between 
documents https://mapzen.com/blog/spelunker-jumping-into-who-s-on-first/ so at a 
minimum we would need a way to teach the yesnofix.js rendering code to display certain 
things (like IDs) as links. 

Customizing things - Values 

To do this for values you need to invoke the set_custom_renderers method passing 
"text" as the first argument and a custom function as the second argument. This function will be 
invoked for each value that the yesnofix.js code tries to render. 


Your custom function will be invoked with two arguments: data which is the actual value 
in question and ctx which is the nested path in dot notation (described above) that contains 
data. Your custom function is expected to either return a function (that itself returns an HTML 
DOM element) or null. If your callback's response is null then the code will simply include 
the raw value as-is. 



The yesnofix.js code defines some handy helper methods for common tasks (like 
render_code or render_link) but in the example below you can see how we are also 
defining some custom methods, like render_wof_id. 

var possible_wof = [ 
    'wof.belongsto', 
    'wof.parent_id', 'wof.children', 
    // and so on... 
]; 

var text_callbacks = { 
    'wof.id': mapzen.whosonfirst.yesnofix.render_code, 
    // and so on... 
}; 

var text_renderers = function(d, ctx){ 

    if ((possible_wof.indexOf(ctx) != -1) && (d > 0)){ 
        return self.render_wof_id; 
    } 

    else if (text_callbacks[ctx]){ 
        return text_callbacks[ctx]; 
    } 

    // and so on... 
    else { 
        return null; 
    } 
}; 

'render_wof_id': function(d, ctx){ 
    var root = mapzen.whosonfirst.spelunker.abs_root_url(); 
    var link = root + "id/" + encodeURIComponent(d) + 
    var el = mapzen.whosonfirst.yesnofix.render_link(link, d, ctx); 
    var text = el.children[0]; 
    text.setAttribute("data-value", mapzen.whosonfirst.php.htmlspecialchars(d)); 
    text.setAttribute("class", "props-uoc props-uoc-name props-uoc-name_" + mapzen.whosonfirst.php.htmlspecialchars(c 
    return el; 
} 

mapzen.whosonfirst.yesnofix.set_custom_renderers('text', text_renderers); 

In this example we are rendering things that are WOF IDs (wof.parent_id, 
wof.belongs_to, and so on) as links but we aren't rendering wof.id as a link since there is 
no point in linking to the webpage we are already looking at. 

Customizing things - Dictionaries 

The second thing we needed to customize was the way the keys themselves are displayed. For 
example, we define concordances in Who's On First using short prefixes for other sources. A 
Geonames http://geonames.org ID becomes gn:id, a Library of 
Congress http://loc.gov/ ID becomes loc:id and so on. That's useful and efficient for 
encoding data but not very satisfying to look at. 

Just like text renderers, dictionary renderers are defined by invoking the 
set_custom_renderers method with "dict" as the first argument and a custom function that 
returns a function (that returns a string) or null. 


var dict_mappings = { 
    'wof.concordances.gn:id': 'geonames', 
    'wof.concordances.gp:id': 'geoplanet', 
    'wof.concordances.loc:id': 'library of congress', 
    // and so on... 
}; 

var dict_renderers = function(d, ctx){ 

    if (dict_mappings[ctx]){ 
        return function(){ 
            return dict_mappings[ctx]; 
        }; 
    } 

    return null; 
}; 

mapzen.whosonfirst.yesnofix.set_custom_renderers('dict', dict_renderers); 

In this example loc:id becomes "library of congress" and so on. You may have noticed in the 
screenshots above that we haven't yet defined custom handlers for the name: properties so they 
all still get rendered with names like "eng_x_variant" or "chi_x_preferred". We should fix that. 
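
As a sketch of what such a handler might look like, the dictionary renderer below turns keys 
like eng_x_colloquial in to something a little friendlier. It is purely illustrative: the 
name_renderers function and the label format are invented here, and it assumes that the ctx 
passed for these keys follows the same dot notation as the paths described above (for example 
name.eng_x_colloquial). 

```javascript
// Hypothetical dictionary renderer for the name: properties.
// Assumes ctx looks like "name.eng_x_colloquial"
var name_renderers = function(d, ctx){

    var m = ctx.match(/^name\.([a-z]{3})_x_([a-z]+)$/);

    if (! m){
        return null;
    }

    var lang = m[1];    // for example "eng"
    var kind = m[2];    // for example "colloquial"

    // like the other dictionary renderers, return a function that returns a string
    return function(){
        return lang + " (" + kind + ")";
    };
};

// mapzen.whosonfirst.yesnofix.set_custom_renderers('dict', name_renderers);
```

Returning null for everything else means that keys which aren't names fall through and are 
displayed as-is. 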

Customizing things - Exclusions (or locking things) 

Finally some things just aren't up for debate. The reasons why an application may not want 
to solicit feedback on certain bits of data are many and varied so we'll just leave it at that. The 
point is that you may need to prevent certain properties from being Yes No Fix -able, so you can. 


You do this by invoking the set_custom_exclusions method passing "text" as the 
first argument and a custom function that returns a function (that returns a boolean value). Like all 
the others, your custom function will be invoked with a data (the value) property and a ctx (the 
context or path) property. You might be starting to see a pattern by now. 


var text_exclusions = function(d, ctx){ 

    return function(){ 

        if (ctx.match(/^geom/)){ 
            return true; 
        } 

        else if ((ctx.match(/^edtf/)) && (d == "uuuu")){ 
            return true; 
        } 

        // and so on... 
        else { 
            return false; 
        } 
    }; 
}; 

mapzen.whosonfirst.yesnofix.set_custom_exclusions('text', text_exclusions); 



In this example we are locking things prefixed with geom because their values are derived 
from Sean Gillies https://pypi.python.org/pypi/shapely ... I mean, math. We are also 
locking things prefixed edtf (for the Library of Congress' Extended Date/Time 
Format https://loc.gov/standards/datetime/ ) whose value is already uuuu which is the 
shorthand for "unknown". 
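
Because an exclusion callback is just a function that returns a function, it can be exercised 
without yesnofix.js at all. The sketch below repeats the two rules from the example and checks 
a few paths. The is_locked helper and the sample paths and values are assumptions made for 
the purposes of illustration. 

```javascript
var text_exclusions = function(d, ctx){

    return function(){

        // geom values are derived so they are not up for debate
        if (ctx.match(/^geom/)){
            return true;
        }

        // edtf values that are already "unknown"
        if ((ctx.match(/^edtf/)) && (d == "uuuu")){
            return true;
        }

        return false;
    };
};

// Hypothetical helper: invoke the callback the way a caller might
var is_locked = function(d, ctx){
    var check = text_exclusions(d, ctx);
    return (check) ? check() : false;
};

console.log(is_locked(0.061408, "geom.area"));                  // true
console.log(is_locked("uuuu", "edtf.inception"));               // true
console.log(is_locked("1776", "edtf.inception"));               // false
console.log(is_locked("Frisco", "name.eng_x_colloquial#4"));    // false
```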

Customizing things - Report handlers 

Yes No Fix doesn't concern itself with what happens to a report, by design. 

Its only concern is with the user controls for collecting assertions and finally rendering 
them in to a blob of CSV text. After that it is left up to individual applications using 
yesnofix.js to tell it what to do. An application might post the data to a remote server using 
an API, send the report somewhere via email, try to analyze the data locally and perform an 
action. Whatever. 


Custom report handlers are just plain old Javascript functions that accept a string (the CSV 
text of the report) as their only argument, like this: 


var submit_handler = function(report){ 

    // do something with report here - the point is 
    // that yesnofix.js is agnostic about the meaning 
    // of "do something" 
}; 

mapzen.whosonfirst.yesnofix.set_submit_handler(submit_handler); 
mapzen.whosonfirst.yesnofix.apply(data, target_el); 

That's it. If your application has defined a custom submit handler it will be invoked when a 
user presses the submit report button. If you haven't defined a custom handler then the 
default behaviour is to display the CSV text of the report in a new browser window. 
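
For example, a handler that posts the report to a remote server might look something like the 
sketch below. The endpoint URL and the make_submit_handler helper are invented for the 
illustration, and the send argument exists only so that the network call can be swapped out; 
it is assumed to default to the browser's fetch function. 

```javascript
// Hypothetical helper, not part of yesnofix.js. The endpoint is made up.
var make_submit_handler = function(endpoint, send){

    // Assumption: fall back to the browser's fetch() if nothing is passed in
    send = send || function(url, opts){ return fetch(url, opts); };

    return function(report){

        // nothing to send if the report is empty
        if ((! report) || (report.trim() == "")){
            return null;
        }

        return send(endpoint, {
            'method': 'POST',
            'headers': { 'Content-Type': 'text/csv' },
            'body': report
        });
    };
};

var submit_handler = make_submit_handler("https://example.com/yesnofix/reports");

// mapzen.whosonfirst.yesnofix.set_submit_handler(submit_handler);
```

Keeping the transport separate from the handler is just one way to do it but it mirrors the way 
yesnofix.js itself keeps collecting assertions separate from what happens to them afterwards. 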

Customizing things - enabling or disabling editing controls 

For a little over a month now yesnofix.js has been running quietly on the Who's On 
First Spelunker https://whosonfirst.mapzen.com/spelunker/ , but disabled. What that 
means is that the properties displayed for a place are being rendered by yesnofix.js but the 
editorial controls have been turned off. We did that with a handy enabled method that accepts a 
boolean as its only argument. 


mapzen.whosonfirst.yesnofix.enabled(false); 



As we got ready for this blog post we simply removed that method call. For the technically 
minded, the Spelunker's 
mapzen.whosonfirst.properties.js https://github.com/whosonfirst/whosonfirst-www-spelunker/blob/master/www/static/javascript/mapzen.whosonfirst.properties.js 
file offers a concrete example of all the customizations described so far. 


Yes No Fix and the Spelunker 


I mentioned that Yes No Fix is now enabled on the Who's On First Spelunker, which 
is only half-true. Yes No Fix is absolutely enabled and we'd love for you to try it out and 
let us know https://twitter.com/alloftheplaces what does and doesn't work. As of 
this writing though any reports you generate don't have a place (in Who’s On First -land) to go 
yet. 


They will but, again by design, the data reporting is meant to be entirely separate from the 
data collection. We're finishing up the details on the data collection piece and so, in the meantime, 
we're going to light up the data reporting side of things and see how that works first. 



i*.. / 

You have found an experimental feature! 

Thank you for taking the time to fact-check this data. There are two pieces to any data collection project: the reporting 
and the collecting. If you're reading this it means that only the reporting piece is live for Who's On First. We expect the 
collection piece to be live shortly but in the meantime you can generate a text version of your report. Soon you will be 
able to send it to Who's On First directly. If you'd like to know more about this project all the details are available in 
this blog post: 

https://mapzen.com/blog/yes-no-fix/ 

[Screenshot: the Spelunker page for Tokyo with the Yes No Fix report panel open. The report is shown as CSV text with the columns path, value, assertion and date, and contains a single row for one of Tokyo's Japanese name properties dated 2016-04-07T18:07:01.525Z.] 


Once both of these things are up and running the next task will be to use those reports to 
inform the editorial tools and processes https://github.com/whosonfirst/whosonfirst-www-boundaryissues 
we are starting to build for managing Who's On First. But that's another blog 
post for another day. 


Yes No Fix is not just a Who's On First (or a Mapzen) thing 



Yes No Fix is not designed to work with any one application but rather with arbitrary blobs of 
data. For example, once we finish getting the data collection hooks working the obvious next step 
is to make the neighbourhoods (and other placetypes) in the Who's On First I Am 
Here https://mapzen.com/blog/iamhere/ map Yes-No-Fix-able. 


My current mornings-and-weekends project is building a tool that will create a static 
HTML archive https://github.com/thisisaaronland/go-cooperhewitt-shoebox of all 
the things I've collected at, or on the website of, the Cooper Hewitt Smithsonian Design 
Museum https://collection.cooperhewitt.org . The tool uses the museum's 
API https://collection.cooperhewitt.org/api/ to fetch data about the 
objects https://collection.cooperhewitt.org/api/methods/cooperhewitt.objects.getInfo 
I've collected as well as the act of 
collecting https://collection.cooperhewitt.org/api/methods/cooperhewitt.shoebox.items.getInfo 
them. 


Like the properties in a Who's On First document the Cooper Hewitt data is encoded as 
blobs of 
JSON https://github.com/cooperhewitt/collection/blob/master/objects/184/915/01/18491501.json 
which is great for robots but not really great for humans to look at. I was able to 
drop the yesnofix.js libraries in to my code and point the API response blobs at 
them https://github.com/thisisaaronland/go-cooperhewitt-shoebox/blob/master/javascript/shoebox.item.js#L59-L81 
and poof there was something 
a little less terrible to look at. Just the way the Spelunker has been using yesnofix.js with the 
editorial controls disabled, for over a month now. 





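[Screenshot: the shoebox item "Lamp 1976-84-1" rendered by yesnofix.js as tables of properties, including action, created, description, id, is_public, lastmodified and refers_to for the act of collecting, and accession_number, creditline, date and decade for the object itself.] 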


This tool is still very much a "wet paint" project so it hasn't been taught to use any of the 
fancy rendering tricks described above but that's really just a question of time-and-typing. 
Likewise the actual Yes No Fix controls are disabled but maybe it's worth enabling them and 
creating a custom submit handler directing people to send their reports to the museum via their 
Zendesk account https://cooperhewitt.zendesk.com/hc/en-us/requests/new . 


[Screenshot: the Cooper Hewitt "Submit a request" form on Zendesk, with the subject "Object ID 18491501" and a Yes No Fix report pasted in to the description field: 

path,value,assertion,date 
_global_.dimensions_raw.depth#0,22.50,-1,2016-04-07T19:20:33.355Z ] 


This is what we mean when we say that Yes No Fix is designed to be agnostic about what 
happens with the data it collects. Also... OMG, that 
lamp!!!! https://collection.cooperhewitt.org/objects/18491501/ 
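
To make "agnostic about what happens with the data" a little more concrete, here is a minimal sketch of turning a list of assertions in to report text. It is not the yesnofix.js API: the function names and the shape of the assertion objects are hypothetical, and the column order (path, value, assertion, date) is assumed from the report text visible in the screenshots above. 

```javascript
// Hypothetical sketch, not part of yesnofix.js: serialize assertions as CSV.

// Quote a field if it contains a comma, a double quote or a newline.
function csvField(value) {
  const text = String(value);
  if (/[",\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// assertions is an array of { path, value, assertion, date } objects.
// The column order is an assumption based on the screenshots.
function reportToCSV(assertions) {
  const rows = [["path", "value", "assertion", "date"]];
  for (const a of assertions) {
    rows.push([a.path, a.value, a.assertion, a.date]);
  }
  return rows.map((row) => row.map(csvField).join(",")).join("\n");
}

// The row below is the one shown in the Zendesk screenshot.
const report = reportToCSV([
  {
    path: "_global_.dimensions_raw.depth#0",
    value: "22.50",
    assertion: -1,
    date: "2016-04-07T19:20:33.355Z",
  },
]);

// What happens next is up to the caller: show the text in a new window,
// paste it in to a help desk form, send it to a server, and so on.
console.log(report);
```

The point of the sketch is only that the report is plain text and that nothing in it knows, or cares, where it ends up. 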

Version Less-than-one 

It is still early days for yesnofix.js and there are many things it doesn't do, or doesn't 
do as well as it should, yet. A short and not-comprehensive list includes: 


* Keyboard shortcuts 

* Working well (or at all) on mobile devices 

* Working well with multi-level nested data structures, specifically where to put all 
those nested tables on a finite amount of screen space and then where to squeeze in the 
editorial controls 

* Support for any language besides English 

* Proper testing for browser support 

* Better documentation, particularly for all the customization handlers 


But it's a start. It's also, we hope, a start at finding a middle ground between not accepting 
any feedback about the data in Who's On First https://whosonfirst.mapzen.com and 
throwing the doors wide-open and letting anyone edit whatever they want. The latter just isn't 
going to happen soon for a whole bunch of reasons but the former also feels kind of rude. Yes No 
Fix is meant to be a way for people to contribute to the data and, once the data collection piece is 
complete, for that work to have a safe place to live on the internet and to start to be 
used to affect the final editorial work. 


Yes No Fix is not a perfect solution to the problem, and there is plenty of work left to do, 
but our hope is that it will at least make things a little better than they were yesterday. 


2016-04-08 


Missing the Point - GeoIP's, Points, 
Polygons, and a Precarious Farm in 
Kansas 



This post was co-authored with Dave 
Riordan https://twitter.com/riordan and originally published on the Mapzen 
Weblog https://mapzen.com/blog// , in April 2016. 

There are few things more dangerous than an overconfident point when it’s 
placed on a map. These sorts of points are intended to represent a place, like a house, a 
town, a city, or a country. But what happens when the thing underneath that point is 
not that place (or is a fundamental misunderstanding of what that place is supposed to 
be)? 


This past weekend, Kashmir Hill https://twitter.com/kashhill of 
Fusion https://fusion.net reported a terrifying 
story https://fusion.net/story/287592/internet-mapping-glitch-kansas-farm/ 
that reflects the dark side of what happens when map data is misinterpreted 
and people are overconfident in what lies beneath that dot. 


The story revolves around a small farm in Kansas that has been the victim of 
mistaken identity many times over. As Kashmir writes: 


They’ve been accused of being identity thieves, spammers, scammers and 
fraudsters. They’ve gotten visited by FBI agents, federal marshals, IRS 
collectors, ambulances searching for suicidal veterans, and police officers 
searching for runaway children. They’ve found people scrounging around 
in their barn. The renters have been doxxed, their names and addresses 
posted on the internet by vigilantes. Once, someone left a broken toilet in 
the driveway as a strange, indefinite threat. 


The underlying cause of this horrible case of mistaken identity is a 
rather common technology called GeoIP. GeoIP providers attempt to connect clusters 
of IP addresses to a geographic region. Sometimes that can be fairly specific, down to 
the city; other times it can only link to a state or country. If Facebook has asked you whether 
you tried to log in from a country you haven't visited, or you've seen a map that centers 
on your town by default, this is likely the technology that makes that possible. 
Sometimes, though, it centers on the wrong place, like when dial-up AOL had all their 
IP addresses coming from northern Virginia. 


MaxMind, the most prominent GeoIP provider, only intended to give back the 
"general area" the IP address is in, not to indicate that the precise location lay beneath 
the pin. But that's exactly what many of MaxMind's users have assumed 
the data indicated. Which meant that for any IP address known to be 
"Somewhere in America" (but that can't be pinpointed to a specific city or state), MaxMind 
pointed right at this family farm. In some cases these GeoIP leads can be useful, but 
when it all gets boiled down to that point the nuance is often lost. And that can have 
drastic repercussions. 



Part of the reason we're writing this is to point out that we have our own 
project to augment the MaxMind GeoIP 
database https://whosonfirst.mapzen.com/mmdb with data from Who's On 
First https://whosonfirst.mapzen.com/ to interpret the results coming back 
from an IP lookup as a geographic area, and not a single point. Rather than sending 
back a point and some words saying where an IP address is located, our modified 
version of the MaxMind database returns both a Who's On First ID and a bounding box 
(as well as its complete hierarchy) for that location. It means the United States is the 
container for the United States and that small town in Kansas is just that small town in 
Kansas. This is still an experimental project and we are working through the mechanics 
of what we store in the databases (we could include complete polygons but that might 
make things a bit... heavy) and how often things get updated. As of this writing we 
haven't yet updated our databases to reflect the changes that MaxMind has made to 
their own data http://fusion.net/story/290772/ip-mapping-maxmind-new-us-default-location/ . 
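
To make the difference between a point and an area concrete, here is a minimal sketch of consuming a lookup result that carries a bounding box. The record shape (wof_id, name, bbox), the helper names and the ID are all hypothetical and are not the actual layout of our modified MaxMind databases. The bounding box for the United States is a rough approximation used only for illustration, and the sketch ignores boxes that cross the antimeridian. 

```javascript
// bbox is [min_lon, min_lat, max_lon, max_lat] in decimal degrees.
function bboxCenter(bbox) {
  const [minLon, minLat, maxLon, maxLat] = bbox;
  return { lat: (minLat + maxLat) / 2, lon: (minLon + maxLon) / 2 };
}

// Approximate width and height of the box in kilometres. One degree of
// latitude is about 111 km; a degree of longitude shrinks with cos(latitude).
function bboxExtentKm(bbox) {
  const [minLon, minLat, maxLon, maxLat] = bbox;
  const midLat = ((minLat + maxLat) / 2) * (Math.PI / 180);
  return {
    width: Math.abs(maxLon - minLon) * 111 * Math.cos(midLat),
    height: Math.abs(maxLat - minLat) * 111,
  };
}

// Describe the result as an area, with a sense of how big that area is,
// instead of implying that something sits exactly at the center point.
function describeLookup(result) {
  const extent = bboxExtentKm(result.bbox);
  const size = Math.round(Math.max(extent.width, extent.height));
  return `somewhere in ${result.name}, an area roughly ${size} km across`;
}

const lookup = {
  wof_id: 1234, // made-up ID, for illustration only
  name: "the United States",
  bbox: [-125.0, 24.0, -66.0, 50.0], // rough approximation
};

console.log(describeLookup(lookup));
console.log(bboxCenter(lookup.bbox));
```

The center of the box is still available to anyone who wants to position a map, but it travels together with the information needed to say how little it means. 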





We have been "helpful" and auto-positioned the map for you... 

Using your computer's IP address we've asked the computer-robots-in-the-sky where in the world they think you might be right 
now. They seem to think you are somewhere near or around New York. We've used this information to auto-position the map 
accordingly. Sometimes the mappings from IP address to location are weird. Sometimes they are just wrong. Sometimes 
computers being "helpful" like this is weird and creepy so we've added a setting to allow you to disable this feature in the future. 

[ ] Please disable IP lookups altogether 

Do not show this notice again. 


the map is centered at 40.697299,-73.979187 at zoom level 9 which appears to be somewhere in New York Navy Yard or Fort Greene 


One of the other things this story has prompted us to do is finish up the work to 
enable IP lookups in the I Am Here 
project https://mapzen.com/blog/iamhere/ . When enabled the code on the I 
Am Here website https://whosonfirst.mapzen.com/iamhere will try to use your 
computer's IP address and its corresponding location (using a Who's On First enabled 
MaxMind database https://whosonfirst.mapzen.com/mmdb/ ) to automatically 
position the map. Previously the map would always load centered on San Francisco's 
majestic Space Claw https://www.flickr.com/people/spaceclaw/photosof/ ...I 
mean Sutro Tower, which is great but a bit tiresome if you don't live in San Francisco and 
always need to start using the website by zooming out to a different location. 


The reason that we haven't enabled the IP lookup functionality sooner is 
because we noticed that sometimes when you load the website from a computer in San 
Francisco the map would automatically position itself in... you got it, Kansas. Now 
when you load I Am Here https://whosonfirst.mapzen.com/iamhere/ for the 
first time you will see a modal dialog explaining that there are computers trying to be 
helpful, that sometimes their suggestions aren't very helpful and finally the option to 
tell the computers to stop helping you. Hopefully the computers will get it right more 
often than not and, because IP lookup is pretty cool when it works, there will be a way 
for people to use I Am Here without having to always start in San Francisco. 
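
The decision described above (respect the opt-out, use the IP lookup when there is one, otherwise fall back to Sutro Tower) can be sketched as a single function. This is not the I Am Here source code: the setting names, the zoom levels and the shape of the lookup result are hypothetical, and the coordinates for Sutro Tower are approximate. 

```javascript
// Fallback view. The coordinates for Sutro Tower are approximate and the
// zoom level is an arbitrary choice for this sketch.
const DEFAULT_VIEW = { lat: 37.7552, lon: -122.4528, zoom: 12, source: "default" };

// settings.disableIpLookup: the user has asked the computers to stop helping.
// settings.hideIpNotice: the user has asked not to see the notice again.
// lookup: the result of an IP lookup with a bbox of
// [min_lon, min_lat, max_lon, max_lat], or null if there was no result.
function initialView(settings, lookup) {
  if (settings.disableIpLookup || !lookup || !Array.isArray(lookup.bbox)) {
    return DEFAULT_VIEW;
  }
  const [minLon, minLat, maxLon, maxLat] = lookup.bbox;
  return {
    lat: (minLat + maxLat) / 2,
    lon: (minLon + maxLon) / 2,
    zoom: 9,
    source: "ip",
    // Tell people that the map was auto-positioned unless they opted out.
    showNotice: !settings.hideIpNotice,
  };
}

console.log(initialView({}, { bbox: [-74.5, 40.5, -73.5, 41.0] }));
console.log(initialView({ disableIpLookup: true }, { bbox: [-74.5, 40.5, -73.5, 41.0] }));
```

Keeping the opt-out check first means that a disabled lookup never influences the map, even if a result happens to be available. 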


But this story, and others http://fusion.net/story/214995/find-my-phone-apps-lead-to-wrong-home/ 
like it https://gimletmedia.com/episode/53-in-the-desert/ , should be a wake-up 
call to folks who design geospatial systems. When conveying ambiguity, it's hard to 
predict how your users' users will interpret the thing. And while longitude 
and latitude are ubiquitous, they're not always the right way to send along the context involved. 

MaxMind has already taken some corrective 
steps http://fusion.net/story/290772/ip-mapping-maxmind-new-us-default-location/ 
by changing the point that represents America from the farm at [38,-97] 
to the center of a nearby lake, but that still doesn't address the necessary issue of 
conveying how big a matched place may be. 


But to them, and to you, please: Consider the polygon. 


Mapping With Bias 




The following are the slides and notes for a talk I presented 
at the Mapping With 
Perspective https://mapzen.com/blog/mapzen-sf-event-april-26/ 
event held at the Mapzen 
West https://whosonfirst.mapzen.com/spelunker/id/907212541/ 
offices, in April 2016. This post was originally published on the 
Mapzen weblog https://mapzen.com/blog/mapping-with-bias/ , 
in August 2016. 


If you've read the first introductory blog 
post https://mapzen.com/blog/who-s-on-first/ about Who's 
On First or the talk I did at 
FOSS4G https://mapzen.com/blog/spelunker-jumping-into-who-s-on-first 
introducing the Spelunker much of what follows 
will be familiar. It was a short talk so rather than get lost in the 
technical 
details https://www.github.com/whosonfirst/whosonfirst-docs/ 
I tried instead to focus on some of the principles, and 
statements of bias, that influence the project. Why we're doing this, 
rather than how, particularly for people who might not have read 
those first two blog posts. 


My talk was titled "Mapping with Bias" and this is what I 
said. 


I had stickers made for the Who’s On 
First https://whosonfirst.mapzen.com/ project recently. They 
look like 
this. /weblog/2016/08/15/things/images/stickers.jpg 


No one has any idea what they depict and that’s led to some 
hilarious speculation about what they "are" ranging from a hockey 
stick (I am from Canada) to the Lexus car logo to an 
e-cigarette https://twitter.com/alloftheplaces/status/720359919749169152 . 



It’s actually a pitot 
tube https://www.grc.nasa.gov/WWW/k-12/VirtualAero/BottleRocket/airplane/pitot.html . 

Wikipedia describes a pitot 

tube https://en.wikipedia.org/wiki/Pitot_tube as: 


...a pressure measurement instrument used to 
measure fluid flow velocity. 


It goes on to describe "flow velocity" as: 


...a vector field which is used to mathematically 
describe the motion of a continuum. 


Who’s On First https://whosonfirst.mapzen.com is 
not really a pitot tube but I like the idea that there might be an 
instrument to measure the motion - the velocity - of people’s 
understanding of place. 

Also, I like shiny 
things /weblog/2016/08/15/things/images/stickers.jpg . 




Who's On First https://whosonfirst.mapzen.com is a 
gazetteer. A gazetteer is a "big bag of places" in which every place 
has a stable and permanent identifier, supporting metadata and 
pointers to other IDs in the gazetteer for places with which it has a 
relationship. 

Over 15,000 words have been written about Who's On 
First https://whosonfirst.mapzen.com/#theory so far 
because it turns out that gazetteers are a pretty complicated subject. 




Rather than trying to squeeze all the 
details https://mapzen.com/tag/whosonfirst/ in to a 20-minute 
talk I am going to focus instead on some of the first 
principles motivating the project and governing our day-to-day work. 




The toxic trinity of "geo" has always been the unholy union 
of: licensing and coverage and quality. 

The aim of Who's On 
First https://whosonfirst.mapzen.com is to tackle all three at 
the same time. 




If and when we are forced to "pick 
two" https://en.wikipedia.org/wiki/Project_management_triangle#.22Pick_any_two.22 
we will choose licensing and 
coverage, so that in the end there is always something left over that 
people can improve as time and circumstances permit. 

These decisions make Who's On 
First https://whosonfirst.mapzen.com both ambitious and 
daunting so I have always felt that it is important for us to have a 
governing bias with which to negotiate the complexities and the 
quicksand that the project will inevitably yield. 


In many ways, these are principles to help us understand 
what the project is not and to help us understand how things should 
be even if the technology doesn't always work as well as we imagine 
it should, yet. 


A gazetteer is a pretty brainy project. Gazetteers are one of 
those things that don't seem important at all until they are, at which 
point they suddenly take on an outsized importance. This means we 
need to design things in such a way that our work can outlast 
people's reluctance. 

We need to build something with the patience and the 
stamina, conceptually and financially, to sit quietly in the corner and 
be ready to be of service when you are and not before. 



What follows are six "umbrella" ideas that we keep in mind 
as we work towards that goal. 



a gazetteer 

of consensual hallucinations 


The first is the idea that Who’s On 
First https://whosonfirst.mapzen.com/ is a gazetteer of 
consensual hallucinations. 





The history of geospatial technologies has for the most part been 
one of force 
projection http://www.aaronland.info/weblog/2012/03/13/godhelpus/#sxaesthetic 
and tax 
collection http://www.aaronland.info/weblog/2015/02/24/effort/#holodeck . 
Some people argue they are the same thing. 


But there is a good reason that, for example, California alone 
has seven state 
planes http://www.conservation.ca.gov/cgs/information/geologic_mapping/state_plane : 
There are actual tax dollars, and 
services like emergency responders, that depend on being able to 
precisely and accurately locate a thing in a world where latitude can 
not be neatly subdivided in to equal 
units https://en.wikipedia.org/wiki/Geographic_coordinate_system#Complexity_of_the_problem 
across the surface of the 
globe. 


It is worth remembering that coordinate space is one of the 
truly great abstractions. Being able to reduce the problem, in so 
many cases, to fit a Cartesian grid has made some pretty amazing 
things possible. 


On the other hand, I cut my teeth working on geo at Flickr 
where we learned over and over and over again that no one thinks of 
place as coordinate 
data http://code.flickr.net/category/geo , at least not when 
it comes to their photos https://www.flickr.com/places/ . 


It’s not just Flickr. There is a long and growing list of 
companies - really, all companies - whose services are implicitly 
built around social rather than administrative notions of place. And 
social notions of place are messy and inexact and complicated. 
This is the space where Who's On 
First https://whosonfirst.mapzen.com sits. 



Who's On First https://whosonfirst.mapzen.com is, 
by design, not a gazetteer of unitary perspective. 



all the places 

or: “multiple geometries” 


Or put another way, Who’s On 

First https://whosonfirst.mapzen.com/ is not a gazetteer of 
geometries. 





One of our earliest decisions was that each record in 
Who’s On First https://whosonfirst.mapzen.com would 
contain multiple geometries. 


As a rule every place in Who's On 
First https://whosonfirst.mapzen.com should contain a 
so-called "ground truth" geometry. A ground truth geometry is like 
Benoit Mandelbrot’s map of 
England https://en.wikipedia.org/wiki/How_Long_Is_the_Coast_of_Britain%3F_Statistical_Self-Similarity_and_Fractional_Dimension 
which, by definition, 
means it will always grow in size and detail. 


Some places might even have two ground truth geometries, 
one clipped to the coastline and another that includes territorial 
waters. These geometries are especially good for reverse 
geocoding https://github.com/whosonfirst-data/whosonfirst-data/issues/367 
but the salient point in all of 
this is that the geometries themselves, as often as not, enforce the 
biases of their use. 


There might also be "folk" geometries which encode a 
common (or folk) understanding of a 
place https://github.com/whosonfirst-data/whosonfirst-data/blob/master/data/859/225/83/85922583-alt-mapzen.geojson 
rather than an official designation. Think of Eric 
Fischer's Locals and Tourists 
maps http://www.moma.org/interactives/exhibitions/2011/talktome/objects/146200/ . 


There will inevitably be disputed geometries. These are 
different from places that have been classified as 
disputed https://whosonfirst.mapzen.com/spelunker/placetypes/disputed/ , 
places like Kashmir or the Golan Heights. These 
are places where the stakes are not so high. Places where even 
though there may be officially recognized boundaries people still 
bicker over the details. Effectively all 
neighbourhoods http://www.gowanusheights.info , 
everywhere. 


We may disagree on where The 
Tenderloin https://whosonfirst.mapzen.com/spelunker/id/85865903/descendants/?exclude=nullisland&placetype=microhood 
starts and stops, for 
example, but we all agree that The Tenderloin exists. 


The purpose and the value of Who's On 
First https://whosonfirst.mapzen.com is in giving those 
notions of 
place https://whosonfirst.mapzen.com/spelunker/id/10211217/ 
a collective proof. In giving them a mass and weight and a 
gravity in the universe that other people and products can orbit. 



common ancestors 

because all families are psychotic 


Another core principle of Who's On 
First https://whosonfirst.mapzen.com is that every record 
shares a common set of ancestors. 





Hierarchies, in particular administrative hierarchies, vary 
wildly from country to country. We used to say that all locations in 
Who's On First https://whosonfirst.mapzen.com share a 
common hierarchy but I think that was often more confusing than 
not. 


It is an articulation that lends itself to the idea, incorrectly, 
that there is a single comprehensive hierarchy which encodes all the 





relationships between places. That is not what Who's On 
First https://whosonfirst.mapzen.com tries to do. 


Instead we have said that there are five common 
placetypes https://github.com/whosonfirst/whosonfirst-placetypes#here-is-a-pretty-picture - continent, country, 
region, locality and neighbourhood - and that every record in Who's 
On First https://whosonfirst.mapzen.com , regardless of its 
specific 
placetype https://whosonfirst.mapzen.com/spelunker/placetypes/ , 
has at least one of the common placetypes as an ancestor. 


This acts as a baseline for a global dataset, both on a 
conceptual and a practical level. It is important to us that, within 
reason, we not impose any single architectural approach or set of 
technical requirements in order to be able to use Who's On 
First https://whosonfirst.mapzen.com . 


Five "database" columns for encoding a global hierarchy 
seems like a reasonable trade-off in 2016. If you need to include 
Brooklyn, 
NY https://whosonfirst.mapzen.com/spelunker/id/421205765/ 
(which is technically a 
borough https://whosonfirst.mapzen.com/spelunker/placetypes/borough ) 
in your dataset then you'll need to add a sixth 
column but that's your business. Otherwise you can hopefully make 
do with New York 
City https://whosonfirst.mapzen.com/spelunker/id/85977539/ . 
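
As a minimal sketch of what those five columns might look like in code: the function names and the shape of the hierarchy object are hypothetical (real records may name their keys differently), and the only IDs used are the two that appear in the links above. 

```javascript
// The five common placetypes named in the text.
const COMMON_PLACETYPES = ["continent", "country", "region", "locality", "neighbourhood"];

// hierarchy maps placetype names to IDs, for example { locality: 85977539 }.
// Returns a row with exactly five columns. Missing ancestors are null and
// placetypes outside the common set (like borough) are simply left out.
function toCommonColumns(hierarchy) {
  const row = {};
  for (const placetype of COMMON_PLACETYPES) {
    row[placetype] = Object.prototype.hasOwnProperty.call(hierarchy, placetype)
      ? hierarchy[placetype]
      : null;
  }
  return row;
}

// Every record is expected to have at least one common placetype as an ancestor.
function hasCommonAncestor(hierarchy) {
  const row = toCommonColumns(hierarchy);
  return COMMON_PLACETYPES.some((placetype) => row[placetype] !== null);
}

// A place in Brooklyn: the borough is dropped, New York City is kept.
const row = toCommonColumns({ locality: 85977539, borough: 421205765 });
console.log(row);
console.log(hasCommonAncestor({ borough: 421205765 })); // false
```

Anyone who does need the borough can add a sixth column of their own without changing what the first five mean. 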


Importantly, unknown place types are not a fatal error. They 
are left to the needs and discretions of people using Who's On 
First https://whosonfirst.mapzen.com for whatever they need 
to use it for, without sacrificing a common ground where all of these 
projects can still comfortably hold hands. 


There is also a related discussion about places having 
multiple hierarchies but we don't have time for that tonight. Suffice it 
to say that places can and do have multiple 
hierarchies https://github.com/whosonfirst/whosonfirst-placetypes#hierarchies 
for much the same reasons that a place 
might have multiple geometries. 



all the ancestors 

all of this has happened before 


Who’s On First https://whosonfirst.mapzen.com is 
not a linear scorched-earth view of the world. 





Places change. The physical boundaries of the USA changed 
141 times https://github.com/whosonfirst-data/whosonfirst-data/issues/176 
between the years 1789 and 
1959. The entire notion of what Yugoslavia meant changed three 
times in the 20th century before finally atomizing in to seven 
countries https://en.wikipedia.org/wiki/Yugoslavia#/media/File:Former_Yugoslavia_2008.PNG , 
by 2008. 




Ultimately there is a much larger question about how an 
individual, or worse a community, decides whether an event 
constitutes a simple update versus a fundamental change. This is the 
realm of hard philosophical questions and those are things we are not 
going to try to answer. 


We can provide breadcrumbs, though. Every record in Who's 
On First https://whosonfirst.mapzen.com has both a 
superseded_by and supersedes property that are used to 
signal that a change has occurred but not necessarily why. That part 
is left up to you. 

These properties act as a kind of linked-list for 
places https://en.wikipedia.org/wiki/Linked_list 
indicating, for example, that the Kingdom of Yugoslavia was 
superseded by the Federal People's Republic of Yugoslavia in 1946, 
and so on. 
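
Following those breadcrumbs can be sketched as a short walk over the links. This is an illustration rather than Who's On First code: the record shape is simplified, both properties are assumed to hold lists of IDs, the IDs are made up and the chain stops after two records even though the real history does not. 

```javascript
// records is a Map from ID to { id, name, supersedes: [...], superseded_by: [...] }.
// Starting from one record, follow superseded_by links and return the IDs of
// the records that have not themselves been superseded. Each record is
// visited at most once, so a malformed loop in the data can not hang the walk.
function currentVersions(records, startId) {
  const seen = new Set();
  const current = [];
  const queue = [startId];
  while (queue.length > 0) {
    const id = queue.shift();
    if (seen.has(id) || !records.has(id)) {
      continue;
    }
    seen.add(id);
    const next = records.get(id).superseded_by || [];
    if (next.length === 0) {
      current.push(id);
    } else {
      queue.push(...next);
    }
  }
  return current;
}

// Made-up IDs; the names follow the example in the text.
const records = new Map([
  [1, { id: 1, name: "Kingdom of Yugoslavia", supersedes: [], superseded_by: [2] }],
  [2, { id: 2, name: "Federal People's Republic of Yugoslavia", supersedes: [1], superseded_by: [] }],
]);

console.log(currentVersions(records, 1)); // [ 2 ]
```

Note that the walk only reports that a change happened and where it leads. As above, the why is left up to you. 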


This decision means two things: 

1. That there might be multiple 
entries https://whosonfirst.mapzen.com/spelunker/search/?name=stamen%20design 
for the "same" place in Who's On 
First https://whosonfirst.mapzen.com and 
consumers of the data need to account for this fact. 

2. That if you have been using the first 
iteration https://whosonfirst.mapzen.com/spelunker/id/571704337/ 
of a place in Who's On 
First https://whosonfirst.mapzen.com its 
meaning and semantics won't suddenly change when 
there is a legitimate 
reason https://whosonfirst.mapzen.com/spelunker/search/?name=stamen%20design 
to create a second 
iteration https://whosonfirst.mapzen.com/spelunker/id/907212647/ . 

We do this as a way to foster confidence in the robustness and 
durability of Who's On 

First https://whosonfirst.mapzen.com identifiers. The past is 
complicated territory and though it is not the focus of our daily work 
we want to try and make sure that it is always welcome. 



reflect debate 

a gazetteer of signal fires 


Who’s On First https://whosonfirst.mapzen.com is a 
gazetteer of signal fires. 





It's probably obvious by now but it bears repeating: The 
world is full of complex and contradictory opinions. We do not want 
to try and settle those debates. We can not settle those debates. 


For almost as long as we've had the notion of place itself 
people have had the benefit of complete sentences and entire 
paragraphs and even book-length arguments to make sense of the 
nature and meaning and value of place. 




And still we don't agree so I don't know why anyone can 
imagine that a bag of key/value pairs will do better at answering any 
of these questions. 

Obviously there are a few instances where Who's On 
First https://whosonfirst.mapzen.com needs to assert some 
degree of editorial opinion but as a rule we try to do this as 
infrequently and as transparently as possible. 


When there is genuine debate about something we leave it to 
the consumers of the data to interpret. We want to signal that there is 
debate about something rather than try to gloss over the awkward 
bits. 



failure scenarios 

the data is not the database 


Finally, the data is not the database. 





I mentioned at the beginning that Who's On 
First https://whosonfirst.mapzen.com was designed to 
"outlast people's reluctance". 

What this means is that Who's On 
First https://whosonfirst.mapzen.com is not optimized for 
any one application including 
Mapzen https://www.mapzen.com/ , which makes for some 
awkward conversations around the office from time to time. 



What this means, in concrete terms, is that at its core Who's 
On First https://whosonfirst.mapzen.com is a gigantic bag 
of plain-text files https://whosonfirst.mapzen.com/data/ . 

The failure scenario for updating a Who's On 
First https://whosonfirst.mapzen.com record should always 
be the ability to edit it using nothing more than a text editor. You 
shouldn't have to do that but when everything else breaks you still 
can do that. 
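
To show how little is needed, here is a minimal sketch that fixes a typo in a record using nothing but the file system and a JSON parser. It assumes, for illustration, that a record is a GeoJSON Feature stored as a plain-text file. The function name, the file name and the record itself are made up, and the demonstration runs against a throwaway temporary file rather than real data. 

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Read a record, change one property and write it back as indented text
// that a person can still read and edit by hand.
function updateProperty(file, key, value) {
  const record = JSON.parse(fs.readFileSync(file, "utf8"));
  record.properties[key] = value;
  fs.writeFileSync(file, JSON.stringify(record, null, 2) + "\n");
  return record;
}

// Demonstration with a made-up record in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "wof-"));
const file = path.join(dir, "example.geojson");
fs.writeFileSync(
  file,
  JSON.stringify({ type: "Feature", properties: { name: "Exampel" }, geometry: null })
);

updateProperty(file, "name", "Example");
console.log(fs.readFileSync(file, "utf8"));
```

Nothing here depends on a database, an index or a running service, which is the whole point of the failure scenario. 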


The point is not that Who's On 
First https://whosonfirst.mapzen.com doesn't play with 
databases but that it should be able to play nicely with all the 
databases. The point is that the demands Who's On 
First https://whosonfirst.mapzen.com places on its users 
should be as universal as possible across platforms and concerns. 


Sometimes this makes getting things set up a little harder 
than we'd like but it's 2016 and we've all gotten pretty good at 
processing text files at 
scale https://dl.acm.org/citation.cfm?id=512948 and 
feeding them in to databases. 


Despite all the advances we've made over the years it turns 
out that the simplest, most universal and accessible thing is still 
plain-old, plain-vanilla, plain-text files on disk. 



They have the added benefit of being (still) the most reliable 
way to archive things as the technological landscape 
shifts http://www.cooperhewitt.org/2013/08/26/planetary-collecting-and-preserving-code-as-a-living-object/ , 
year over year. We can print them 
out http://booktwo.org/notebook/wikipedia-historiography/ , 
if necessary. 


This focus - of demanding a high degree of portability and 
durability in our work - is very much influenced by the early 
systems 
designs https://www.princeton.edu/%7Ehos/frs122/unixhist/finalhis.htm 
for the Unix, and Multics before it, operating 
system and more recently the Unicode http://unicode.org/ 
project. 


These are subjects https://www.bell-labs.com/usr/dmr/www/hist.html 
that could occupy many, many 
more nights of presentations all on their 
own http://www.unicodeconference.org/ and it remains to be 
seen whether we can accomplish our work as well as they did theirs. 


But that is the work. 



thank you 

whosonfirst.mapzen.com 

@alloftheplaces 


Thank you. If you'd like a sticker send up a 
flare https://www.twitter.com/alloftheplaces . 


2016-08-15 




sea marshmallows 


go-iiif 





For a whole bunch of reasons I've found myself thinking 
about the International Image Interoperability 
Framework http://iiif.io/ , which is often just referred to as 
"IIIF", lately. If you've never heard of IIIF it is a standard developed 
principally by the library and archives community with three 
principal areas of interest: images, publications and search. 


The first (images) is a standardized URI-based syntax for 
common operations around image manipulation. The second 
(publications) is a declarative syntax for essentially defining learning 
modules around the idea of the slideshow. The third (search) always 
seems to stray quickly in to territory labeled "metadata" which... 
well, is not my jam but neither is it my party so I just try to maintain 
a healthy distance. 
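
To give a sense of what "URI-based syntax" means for the first of those, here is a minimal sketch that builds a request URL following the path template in version 2.1 of the IIIF Image API: identifier, region, size, rotation, then quality and format. The function name, the server address and the image identifier are hypothetical, and the sketch does not validate any of its parameters. 

```javascript
// Build an IIIF Image API (version 2.1) request URL of the form
// {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
// The defaults ask for the whole image, unscaled and unrotated, in the
// server's default quality, as a JPEG.
function iiifImageUrl(baseUrl, identifier, options = {}) {
  const {
    region = "full",     // or "x,y,w,h" in pixels, or "pct:x,y,w,h"
    size = "full",       // or "w,", ",h", "pct:n", "w,h", "!w,h"
    rotation = "0",      // degrees clockwise; prefix with "!" to mirror first
    quality = "default", // or "color", "gray", "bitonal"
    format = "jpg",
  } = options;
  // Identifiers are URI-encoded, so a "/" in an identifier becomes "%2F".
  const id = encodeURIComponent(identifier);
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/${id}/${region}/${size}/${rotation}/${quality}.${format}`;
}

// The top-left 512x512 pixels of an image, scaled to 256 pixels wide.
console.log(
  iiifImageUrl("https://images.example.org/iiif", "lamp.jpg", {
    region: "0,0,512,512",
    size: "256,",
  })
);
```

Because cropping and resizing are both just path segments, asking for one small square of a very large image is an ordinary request. 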

The IIIF Image API http://iiif.io/api/image/2.1/ 
is the thing that's been coming up a lot in a variety of museum-related 
conversations. Images, and more generally "digital assets", 
have been a bit of an albatross around the neck of the cultural heritage 
sector for... basically, forever. The problem has been made worse 
year over year as museums embark on ever more ambitious 
digitization projects that lend themselves to ever more sophisticated 
tools without really bothering to distinguish the layers of concern 
(storage, search, processing and delivery) or the mechanics, and 
more importantly the economics, of how they all fit together. 



Historically the solution has been, and continues to be, 
outsourcing the problem to so-called Digital Asset Management 
System (DAMS) and more recently Image Delivery System (IDS) 
vendors. There is a much larger discussion to be had about that but 
this is not the place, right now. Suffice it to say that if the cultural 
heritage community wants to take on the challenge of standardizing 
on some basic image-related tasks and functionality, and even
endeavour to write software, common to most institutions then that is 
an unqualified Good Thing. 


Which of course means I had a little bit of a freak out the first
time I read the API spec over coffee, one morning. The details of the
freak out aren't really important. I can be pretty impatient about these
things the first time around, not always in a good way.


The relevant bit, for me, is that I kept asking questions and 
badgering the people I knew who are involved with the IIIF project. 
So many times in fact that eventually it seemed like the best thing to 
do to understand the decisions I was questioning and to test whether 
my criticisms passed muster would be to write an implementation of 
the IIIF Image API. So I
did. https://github.com/thisisaaronland/go-iiif


One of the convenient side-effects of a service that
standardizes on operations like image resizing and cropping is it
doubles as a tiled image server. Think a traditional slippy
map https://github.com/thisisaaronland/mapzen-slippy-map but
instead of zooming in and out of "geography" you are
zooming in and out of really big pictures of "culture". It is hard to
explain to people outside the cultural heritage sector just how
anxious, defeated and envious the sector has been since the Google
Art Project https://www.google.com/culturalinstitute
rolled in to town with their fancy gigapixel cameras and the ability
to do to works of
art https://webcast.gigtv.com.au/Mediasite/Play/cf66e12e97314208bfd85342171a791b1d?catalog=0218e4a1-9070-4b7f-b051-54f1933da8e9
what they had previously done to maps.
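
The reason a crop-and-resize service doubles as a tile server is that every tile in a slippy map is just a crop (region) of the source image scaled down to a fixed size. Here is a rough sketch of that idea, assuming 256 pixel tiles and power-of-two scale factors. The function name is made up and this is not necessarily how go-iiif itself computes its tiles.

```python
import math


def tile_requests(width, height, tile_size=256, scale_factors=(1, 2, 4)):
    """Yield (region, size) URL segments covering a width x height image.

    At scale factor N each tile covers tile_size * N source pixels
    and is scaled down by N when it is rendered.
    """
    for scale in scale_factors:
        span = tile_size * scale  # source pixels covered by one tile
        for y in range(0, height, span):
            for x in range(0, width, span):
                # Tiles on the right and bottom edges are clipped.
                w = min(span, width - x)
                h = min(span, height - y)
                region = f"{x},{y},{w},{h}"
                size = f"{math.ceil(w / scale)},"  # "w," means scale to width w
                yield region, size


for region, size in tile_requests(600, 300, scale_factors=(1, 2)):
    print(region, size)
```

Each pair slots in to the region and size positions of an image request, so a map library only needs to know the image dimensions to ask for the right squares.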

Some museums have cobbled together their own solutions for
making zoom-able images available on their websites; it was
definitely one of the things lacking from the Cooper Hewitt
collections website https://collection.cooperhewitt.org/
during my time there. Some have even open-sourced their toolkits
for making image-based slippy maps but nothing has seemed to stick
across the sector.

This was the bias that I approached IIIF from. After all, I like
maps. The result is
go-iiif https://github.com/thisisaaronland/go-iiif and you
can see a live demo of some of what it does over here:

https://thisisaaronland.github.io/go-iiif/

There is also a local copy at
http://www.aaronland.info/weblog/2016/09/18/marshmallows/go-iiif/
for when that link inevitably breaks...


go-iiif https://github.com/thisisaaronland/go-iiif
began life as a fork of Yoan Blanc's iiif Go
server https://github.com/greut/iiif . Almost immediately it
morphed in to something different and we quickly
agreed https://github.com/greut/iiif/pull/2 that the two
code bases should continue independently of one another. Here's the
not-so-short short version of what
go-iiif https://github.com/thisisaaronland/go-iiif does.


• It moves all of the logic for all of the image processing
and IIIF's conceptual hoohah in to discrete packages.
This makes it possible to use the same programming
logic for both an IIIF server implementation
(iiif-server https://github.com/thisisaaronland/go-iiif#iiif-server )
and offline tools
(iiif-tile-seed https://github.com/thisisaaronland/go-iiif#iiif-tile-seed ).

• It defines separate caching layers for source images and
derivatives. What that means, as I write this, is that
there is a disk cache for derivatives and an in-memory
cache for source images. For example, if you are trying
to tile a 12MB image on the fly you probably don't want
to load that same 12MB source file on every single
request for a 256 x 256 pixel square. Memory usage
(and subsequent caching) aside, waiting ~7 seconds for
each tiny little tile to render is a drag.



• Like the different caching layers, source images can also
be loaded from multiple providers. There are four of
them so far, but only two will be of interest to most
people: Things read from disk and things read over the
network using a URI
template https://github.com/thisisaaronland/go-iiif#uri .
The most obvious next provider for both
sourcing and caching images is Amazon's S3 but I
would also like to add support for reading images from
Flickr https://github.com/thisisaaronland/go-iiif/issues/16
and other photo-sharing services.

• It allows for individual IIIF features to be enabled or
disabled via a handy config
file https://github.com/thisisaaronland/go-iiif/blob/master/README.md#config-files .
This is the place where the ambitions of a specification meet
the realities of implementation
details http://www.aaronland.info/weblog/2008/10/08/tree/#capacity-004 .
Sometimes software x
just doesn't support feature y. The details don't really
matter so much as being able to accommodate them.

• It allows you to define additional features that are not
part of the IIIF spec. So far there is exactly one
additional feature: The ability to apply a halftone
dithering
filter https://github.com/thisisaaronland/go-iiif#dithering
to images. Adding new features is
possible but it's not elegant so that's a thing that I'd like
to have a think about going forward but at least it's
been proven to work, now. It may not be obvious to most
people but essentially what's happening here is that I
am slowly re-implementing all of the logic and
functionality used to generate images for the Cooper
Hewitt collections
website http://labs.cooperhewitt.org/2013/b-is-for-beta/ .
Also, probably most of
filtr https://straup.github.io/filtr/ while I am
at it...

• It is written in Go http://golang.org/ partly
because the language lends itself to the problem and
mostly because it allows for pre-compiled binary
versions of all the
go-iiif https://thisisaaronland.github.io/go-iiif/
command-line tools. This is not the reality
today. All of the actual image processing is handled by
the bimg https://github.com/h2non/bimg/
package which is a wrapper for the libvips C
library http://www.vips.ecs.soton.ac.uk/index.php?title=vips .
That still requires a degree of
comfort and familiarity installing third-party
dependencies which is far from ideal. When I started the
go-iiif https://thisisaaronland.github.io/go-iiif/
project half the work was simply an exercise to
re-factor someone else's
code https://github.com/greut/iiif in an effort
to better understand what an IIIF implementation is
supposed to do, and how, so I didn't see the benefit in
also re-implementing everything in pure
Go https://github.com/anthonynsimon/bild/ .
That is pretty high on the list of things to do next, as a
second graphics "engine", which should allow for a
standalone version of
go-iiif https://thisisaaronland.github.io/go-iiif/
that doesn't require anything more than a
simple download. While it may lack the performance of
something like
libvips http://www.vips.ecs.soton.ac.uk/index.php?title=vips
it seems important that people in the
museum community have something they can try and
use with a minimum of fuss. Even if it's only to generate
giant bags of pre-cached
tiles https://github.com/thisisaaronland/go-iiif/tree/gh-pages/tiles/184512_5f7f47e5b3c66207_x.jpg
for slippymaps. That would be progress, I think.

• There is sample code for capturing
screenshots https://github.com/thisisaaronland/go-iiif/blob/gh-pages/javascript/go-iiif.js
of whatever is currently in the viewport of a tiled-image
slippymap and saving that image to your computer. This
doesn't really have anything to do with go-iiif. It's all
front-end code that
other https://github.com/mapbox/leaflet-image
people https://github.com/eligrey/FileSaver.js/
have written but it's still pretty cool.
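
The separate caching layers described in the list above (an in-memory cache for source images, a disk cache for derivatives) can be sketched roughly as follows. This is an illustration of the idea only and not go-iiif's actual code, which is written in Go. The class name, the two callbacks and the key scheme are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


class TwoTierCache:
    """Keep source images in memory and rendered derivatives on disk."""

    def __init__(self, load_source, render, cache_dir):
        self.load_source = load_source  # hypothetical: identifier -> bytes
        self.render = render            # hypothetical: (bytes, params) -> bytes
        self.cache_dir = cache_dir
        self.sources = {}               # in-memory cache of source images
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, identifier, params):
        # One file per (identifier, params) pair, named by a hash.
        key = hashlib.sha1(f"{identifier}/{params}".encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, key)

    def get(self, identifier, params):
        path = self._path(identifier, params)
        # 1. A derivative already rendered to disk wins outright.
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return fh.read()
        # 2. Otherwise load the (large) source at most once...
        if identifier not in self.sources:
            self.sources[identifier] = self.load_source(identifier)
        # 3. ...render the derivative and store it for next time.
        body = self.render(self.sources[identifier], params)
        with open(path, "wb") as fh:
            fh.write(body)
        return body


# Stand-ins for real image loading and processing.
loads = []

def fake_load(identifier):
    loads.append(identifier)
    return b"SOURCE"

def fake_render(source, params):
    return source + b":" + params.encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    cache = TwoTierCache(fake_load, fake_render, tmp)
    cache.get("big.jpg", "0,0,256,256")
    cache.get("big.jpg", "256,0,256,256")
    print(len(loads))  # the source was only loaded once
```

A real implementation would also want to bound the size of the in-memory cache, since holding every source image forever defeats the purpose.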

The performance of
go-iiif https://github.com/thisisaaronland/go-iiif is best
described as pretty fast to very fast. Generating tiles offline, the
bottlenecks are CPU usage and disk I/O with the potential of Go
making either of those things worse by trying to do too many
things https://divan.github.io/posts/go_concurrency_visualize/
at once. The performance and load testing
docs https://github.com/thisisaaronland/go-iiif#performance-and-load-testing
go on to say:



[O]n a machine with 8 CPUs and 32GB RAM I was
able to run the machine hot with all the CPUs
pegged at 100% usage and seed 100,000 (2048x
pixel) images yielding a little over 3 million, or
approximately 70GB of, tiles in around 24 hours.
Some meaningful but not overwhelming amount of 
time was spent fetching source images across the 
network so presumably things would be faster 
reading from a local filesystem. Memory usage 
across all the iiif-tile-seed processes never went 
above 5GB and, in the end, I ran out of inodes. 
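
For a sense of scale, the round numbers in that quote can be turned in to rates. This is back-of-the-envelope arithmetic only: it assumes exactly 3 million tiles, exactly 24 hours and decimal gigabytes, none of which the quote states that precisely.

```python
# Figures quoted above, rounded.
SOURCE_IMAGES = 100_000
TILES = 3_000_000
TOTAL_BYTES = 70 * 10**9      # assuming 1GB = 10^9 bytes
SECONDS = 24 * 60 * 60

tiles_per_image = TILES / SOURCE_IMAGES
tiles_per_second = TILES / SECONDS
images_per_second = SOURCE_IMAGES / SECONDS
bytes_per_tile = TOTAL_BYTES / TILES

print(f"{tiles_per_image:.0f} tiles per source image")
print(f"{tiles_per_second:.1f} tiles per second")
print(f"{images_per_second:.2f} source images per second")
print(f"{bytes_per_tile / 1000:.1f} KB per tile, on average")
```

That works out to roughly 30 tiles per image, about 35 tiles per second and a little over 23KB per tile. Three million small files is also a plausible way to run out of inodes.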


So, it's a start. I still have a number of questions about IIIF 
and pretty serious concerns about ever running a public IIIF server 
(even this one) in front of a general audience of strangers-on-the- 
internet so it's not perfect. But, all in all, it feels better than 
yesterday. 

If nothing else it will be useful for Parallel
Flickr http://www.aaronland.info/weblog/2012/02/14/incentivize/#pda2012


2016-09-18


Venues, Postal Codes... and 
All Those GitHub 
Repositories 



This post was originally published on the Mapzen 
weblog https://mapzen.com/blog/whosonfirst-venues/ , in 
October 2016. 


Venues 



tl;dr - 20 million openly licensed venues, all with full (or at
least partial) Who's On First
hierarchies https://whosonfirst.mapzen.com/spelunker/placetypes/venue .


What is this thing you call "venue"?

By default, Who's On First strives to group places in to as
few distinct place types as
possible https://github.com/whosonfirst/whosonfirst-placetypes .
The goal is to gather places in to buckets where they
have more in common with other kinds of places than not. Initially
we defined venues as:


Things with four walls and a ceiling. 


Then we remembered that we want to (eventually) add all
the landmarks in the
world https://whosonfirst.mapzen.com/spelunker/tags/publicart/
to Who's On First. Rather than create a new placetype
specific for landmarks we decided it made sense to classify them as
venues, and to indicate their
landmark-iness http://www.wikilovesmonuments.org/ via a
Mapzen-prefixed property.

Which meant we needed to change the definition of a venue
to reflect many landmarks' lack of walls or ceilings. Now we define a
venue as:



Places that people might stand around, together. 


That may sound a little silly in the telling. You could find lots 
of fancier ways of saying the same thing but at the end of the day 
you'd still be... saying the same thing and in a way that would 
probably sound like gibberish to non-experts and strip away both the 
ambiguity and the play inherent in the reality of things. 


So, that's what we think a venue
"is" https://www.youtube.com/watch?v=j4XT-l-_3yO


Venues are still the holy grail of geo data, precisely because 
they are the psychic underwear to our daily lives. For good or bad, 
most people's entire lives are mediated through "venues" be they 
commercial, cultural or social. There are a lot of venues, even in the 
smallest of communities. 


Multiply a lot of venues, even in the smallest of communities 
by the entire planet and you've got... well, a lot of venues. 

Which makes venues a hard problem. It's a hard problem
collecting all those venues. It's a hard problem vetting them and
keeping them up to date. It's a hard problem to figure out where to
put them all and harder still to figure out how to make that many
things searchable. These are hard problems and that has meant only
companies with an interest in reselling access to that data have
endeavoured to take up the challenge. That's not very interesting to
Who's On First.



Historically, there haven't been a whole lot of venue databases
that don't impose constraints, of one kind or another, on their re-use.
The goal for Who's On First is to have an openly licensed database
of places that may be used in a commercial project, without any
additional restrictions. Something akin to an
MIT https://en.wikipedia.org/wiki/MIT_License or
BSD https://en.wikipedia.org/wiki/BSD_licenses software
license but for data. That is not everyone's definition of "open" and
that's okay. There are different flavours of "open" and different
reasons for choosing one over another.

Who's On First has always been a wildly ambitious
project https://mapzen.com/blog/mapping-with-bias/ so
rather than be discouraged by something as big and hairy and
complicated as venues, we chose instead to jump in with both feet.

SimpleGeo (2009-2011) 

In 2011, the now defunct geo-services company
SimpleGeo https://www.twitter.com/simplegeo published
their Public Spaces
Collection https://archive.org/details/2011-08-SimpleGeo-CC0-Public-Spaces ,
20 million business listings, as a Creative Commons
Zero https://creativecommons.org/publicdomain/zero/1.0/
(CC0) dataset. Thanks,
SimpleGeo! https://web.archive.org/web/20110424090705/http://blog.simplegeo.com/2011/04/20/open-places-data/


Most venues are in the
USA https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=us
(approximately 12 million of them) but other
countries with a not-insignificant number of venues include the
UK https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=gb ,
Germany https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=de ,
Canada https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=ca ,
Spain https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=es and
Australia https://whosonfirst.mapzen.com/spelunker/placetype/venue?iso=au .
There are still many, many places in the world
where the SimpleGeo dataset doesn't have any venue data so this is
just a beginning, and not a triumphal celebration on arrival. You can
see a complete count of venues by country here:

https://github.com/whosonfirst-data/whosonfirst-data-venue/blob/master/DATA.md


In 2015, we started importing those venues in to Who's On
First. The import process has been a start-and-stop affair: after adding
a little over 8 million venues, last year, we put things on hold and
didn't pick them up again until earlier this year. That work
wrapped https://github.com/whosonfirst-data/whosonfirst-data-venue-us/issues/1
up https://github.com/whosonfirst-data/whosonfirst-data-venue/issues/1 ,
recently, meaning every one of those 20 million
venues has been assigned its very own Who's On First
ID https://mapzen.com/blog/wof-lifecycle-documentation/
and a full set of ancestors within Who's On
First https://github.com/whosonfirst/whosonfirst-placetypes#here-is-a-pretty-picture .


Aside from a general desire to add all those venues in to
Who's On First the sheer volume of data was attractive as a way to
test some of the assumptions and infrastructure we've developed to
manage all these places https://mapzen.com/blog/who-s-on-first/
we've taken under our wing. Any comprehensive database
of venues in Who's On First, whether we seeded it using SimpleGeo
or another source, would have all the same problems of scale so even
though there were some initial concerns about the quality and the
focus of the Public Spaces Collection we figured "Why not at least
use it to start testing things, now?"




The SimpleGeo data was released in 2011 and it's 2016,
today, so it's safe to assume that we will not have a venue record for
that cool new bar which just opened in
London https://whosonfirst.mapzen.com/spelunker/ID/descendants&placetype=venue or Los
Angeles https://whosonfirst.mapzen.com/spelunker/ID/descendants&placetype=venue or
Shanghai https://whosonfirst.mapzen.com/spelunker/ID/descendants&placetype=venue .
We aspire to have those venues, and
aspire to have them in something approaching real-time, but today
we do not.



We might actually have some of those cool new bars in New
York City or San Francisco but that is largely a function of choosing
those two cities to start testing the Who's On First editorial
stack https://mapzen.com/blog/boundary-issues-properties
with, since that is where most of the Mapzen staff live. This coastal
bias is not a feature of Who's On First, only reflective of the fact that
it is still early days...


On the other hand if you need a plumber or a notary in any of
those cities, or places like rural Kansas, it's entirely possible we have
those
listings https://whosonfirst.mapzen.com/spelunker/search/?q=plumber&region_id=85688555 .

That's the interesting thing about the term "venues", in 2016: 
It has become short-hand for a very specific subset of venues, 
eclipsing everything else. Often when people say "venues" they're 
really talking about restaurants and bars in urban or metropolitan
areas and shops where people arguably enjoy spending money on the 
things they're buying (as opposed to, say, picking up toilet paper at 
the drug store). 

These venues are also almost always current venues
suggesting the thrill, however illusory, of something that hasn't been
discovered by everyone else
yet http://www.aaronland.info/weblog/2014/10/06/interpretation/#brick
and aimed very specifically at what marketers refer
to as "the 18-35 year old" demographic.



But let's stop for a moment and remember a few things: Let's
remember the nearly 100-year old butcher
shop https://whosonfirst.mapzen.com/spelunker/id/572151977/
in your neighbourhood. Or the local
bar https://whosonfirst.mapzen.com/spelunker/id/572077043/
that closed last year whose history runs
deeper https://en.wikipedia.org/wiki/The_Lexington_Club
than a pint glass. Let's remember our new best friend Poop Emoji
Rock https://whosonfirst.mapzen.com/spelunker/id/100818401/ .


As important as it is, there is more to life than tomorrow's
happy hour. Plus, sooner or later, everyone gets ejected from the 18-35
Year Old Club http://www.aaronland.net/old-club/ and
then what?


As we've been working with the SimpleGeo dataset it's 
become clear that despite some bunk data and some unfortunate 
absences if there was a venue that existed before 2011 which had any 
kind of following, or traffic, it's probably in there... somewhere. 


Now that we've gotten over the initial hurdle of simply
importing all those venues (more on that below) a big part of our
work in the short-term is to think about how we can allow people to
make sense of all that stuff. It's easy to complain that there are no
open venues in the world but then to be confronted by 86,000
accountants in seven different
countries https://whosonfirst.mapzen.com/spelunker/placetypes/venue/?tag=accountant
can be equally disorienting.


Some of the work going forward will involve Yes-No-Fix
style tools https://mapzen.com/blog/yesnofix allowing
people to quickly flag venues as being still-open or long-since-closed
and some of the work will involve grouping venues in to
categories https://github.com/whosonfirst/whosonfirst-categories
for directed searches and general browsing. Some of
the work will simply be learning by building things and seeing what
works and what doesn't https://www.youtube.com/watch?v=XnoqLtJbjcl .


It's not going to be perfect right away but if we do the job 
well then, at the very least, things will be better than they were 
yesterday. 


In the meantime there are 20,848,132 (and counting) venues
that have been Who's On
First-ified https://whosonfirst.mapzen.com/spelunker/placetypes/venue .


Postal Codes 




tl;dr - 3 million openly licensed postal codes with centroids or
polygons https://whosonfirst.mapzen.com/spelunker/placetypes/postalcode/?exclude=nullisland
in 14 countries.


In the same spirit as venues we've also started to tackle
adding postal codes to Who's On First. The initial import of postal
codes came courtesy of the 7.10.0 release of the Yahoo! GeoPlanet
dataset https://archive.org/details/geoplanet_data_7.10.0.zip ,
which contains postal codes for most countries, even if they
don't include geometries. That's okay because each postal code does
have a parent
ID https://whosonfirst.mapzen.com/spelunker/concordances/geoplanet/
and that gives us enough information to group things
by country and we can add geometries, opportunistically, one
country at a time.


There are more than 14 countries in the world so there is a lot 
of work left to do but here are some of the highlights, to date: 

Thank you, Ordnance Survey! 

We've imported all of the UK Ordnance Survey's Code-Point
Open https://www.ordnancesurvey.co.uk/business-and-government/products/code-point-open.html
dataset in to
Who's On First and we're still trying to sort out why there are a
healthy 300,000 UK postal codes without geometries. On the other
hand there are 1.6 million postal codes with
centroids https://whosonfirst.mapzen.com/spelunker/placetypes/postalcode/?iso=gb&exclude=nullisland
so that's progress.

Thank you, Hannes and Kevin! 

Thanks to Hannes
Junnila https://github.com/hannesj and Mapzen's own
Kevin Kreiser https://twitter.com/kevinkreiser we have
imported official postal codes for
Finland https://github.com/whosonfirst-data/whosonfirst-data-postalcode-fi/issues/1 and
Austria https://github.com/whosonfirst-data/whosonfirst-data-postalcode-at/issues/1 and
Switzerland https://github.com/whosonfirst-data/whosonfirst-data-postalcode-ch/issues/1 .


Thank you, America and Canadia! 

Likewise we've imported the US Census ZIP Code
Tabulation
Areas https://www.census.gov/geo/reference/zctas.html
meaning we have geometries for around 75% of all the postal
codes https://whosonfirst.mapzen.com/spelunker/placetypes/postalcode/?iso=us&exclude=nullisland
in the USA.


North of the US border, it turns out that Canada has about
800,000 six-character postal
codes https://whosonfirst.mapzen.com/spelunker/placetypes/postalcode/?iso=ca .
The licensing around their geometries has a long and tortured
history http://www.michaelgeist.ca/2016/06/crowdsourcedpostalcodelawsuit/ .
Statistics Canada, however, has been nice
enough to publish shapefiles for the 1,621 Forward Sortation
Areas (FSA) http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/bound-limit-2011-eng.cfm
which are represented by the first three characters in a
Canadian postal code.


Using that information we generated approximate centroids
for all the six-character postal
codes https://whosonfirst.mapzen.com/spelunker/placetypes/postalcode/?iso=ca&exclude=nullisland
derived from the
geometric center of their parent FSA. This is not Perfect Data but it
does scope the problem to about a square kilometer rather than an
entire
province https://en.wikipedia.org/wiki/Postal_codes_in_Canada .
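
The FSA trick amounts to a lookup keyed on the first three characters of a postal code. Here is a minimal sketch of that step, assuming the FSA centroids have already been computed from the Statistics Canada shapefiles. The function names are invented, and the postal code and coordinates in the example are placeholders rather than real data.

```python
def fsa_for(postal_code):
    """Return the Forward Sortation Area for a Canadian postal code.

    The FSA is the first three characters, ignoring spaces and case.
    """
    return postal_code.replace(" ", "").upper()[:3]


def approximate_centroid(postal_code, fsa_centroids):
    """Borrow the centroid of the parent FSA, or None if it is unknown."""
    return fsa_centroids.get(fsa_for(postal_code))


# Placeholder values for illustration only.
fsa_centroids = {"A1A": (1.0, 2.0)}

print(fsa_for("a1a 1a1"))
print(approximate_centroid("A1A 1A1", fsa_centroids))
print(approximate_centroid("Z9Z 9Z9", fsa_centroids))
```

Every postal code in the same FSA ends up with the same coordinates, which is why this is approximate data and why the records are worth flagging as such.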


Thank you, Open Addresses! 

Using the
Clustr https://github.com/whosonfirst/Clustr tool
originally developed to extract shapefiles from geotagged Flickr
photos http://code.flickr.net/2008/10/30/the-shape-of-alpha/
we've been able to generate approximate geometries for
postal codes derived from Open
Addresses http://openaddresses.io/ data for
Australia https://github.com/whosonfirst-data/whosonfirst-data-postalcode-au/ , the
Netherlands https://github.com/whosonfirst-data/whosonfirst-data-postalcode-nl/ and
France https://github.com/whosonfirst-data/whosonfirst-data-postalcode-fr/ .


These geometries are very much Weird Data so it is left up to
you to decide whether you think they're better than no data at all. If
nothing else we think having the option to make that choice is better
than no data.


All Those GitHub Repositories 



tl;dr - We are betting on the future and making do with the 
present https://github.com/whosonfirst-data . 

Recently we created a second organization in GitHub for
Who's On First related work. The first organization is called
whosonfirst https://github.com/whosonfirst and houses all
of the software we've written to date as well as a growing body of
theory and
documentation https://github.com/whosonfirst/whosonfirst-cookbook .


The second organization is called
whosonfirst-data https://github.com/whosonfirst-data and it is where
the millions and millions of GeoJSON files representing all of the
places https://mapzen.com/blog/all-of-the-places/ in
Who's On First live. The decision for a second organization was
spurred on by the work we've been doing to import venues and
postal codes.

Work that has translated in to 24 million GeoJSON files 
spread across 488 GitHub repositories. Before we go any further,
let's be clear about one thing: This is not an ideal situation. 


Ultimately our goal is to have a single monolithic Who's On
First repository that will contain all 24 million (and counting)
records. In 2016 storing 24 million tiny files in a single Git
repository is either technically impossible or so impractical as to
"play impossible on TV".


Until that better day comes when a single "mono-repo" is
possible we have been working instead to establish conventions for
what repository a given file lives in and what that repository is
called. The naming conventions for repositories at their most
granular are as follows:


"whosonfirst-data-" +
WHOSONFIRST_PLACETYPE + "-" +
WHOSONFIRST_COUNTRY_CODE + "-" +
WHOSONFIRST_SUBDIVISION_CODE


For example: 

• whosonfirst-data https://github.com/whosonfirst-data/whosonfirst-data :
administrative data (continents -
microhoods https://github.com/whosonfirst/whosonfirst-placetypes )
for the world

• whosonfirst-data-venue-us-ca https://github.com/whosonfirst-data/whosonfirst-data-venue-us-ca :
venues in California, USA

• whosonfirst-data-venue-ca https://github.com/whosonfirst-data/whosonfirst-venue-ca :
venues in Canada

• whosonfirst-data-postalcode-fi https://github.com/whosonfirst-data/whosonfirst-data-postalcode-fi :
postalcodes in Finland
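
The naming convention and the examples above can be expressed as a small function. This is a sketch of the rule as described here, not a tool from the whosonfirst organization, and the function name and its arguments are made up for illustration.

```python
def repo_name(placetype=None, country=None, subdivision=None):
    """Build a repository name, stopping at the first missing part.

    With no arguments this is the administrative data repository.
    """
    parts = ["whosonfirst-data"]
    for part in (placetype, country, subdivision):
        if part is None:
            break
        parts.append(part.lower())
    return "-".join(parts)


print(repo_name())                       # administrative data
print(repo_name("venue", "us", "ca"))    # venues in California, USA
print(repo_name("venue", "ca"))          # venues in Canada
print(repo_name("postalcode", "fi"))     # postalcodes in Finland
```

Note that "ca" means California in one name and Canada in another: the position of a code in the name, not the code itself, is what says whether it is a country or a subdivision.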

The first thing to note is that not all repositories are as 
granular as the rules described above. 


Wherever feasible we try to bundle records with as little
granularity as possible. For example postalcodes are
grouped by country as are venues unless there are so many of them,
like in the USA, that it is not practical to keep them in a single parent
repository.

If a repository grows to hold so much data that it is no longer
practical to keep everything in one place then it may be subdivided
in to a number of child repositories. Venues are a good example of
this.


We try to maintain a separate "parent" repository for things
that have been broken out in to multiple child repositories. For
example there is a
whosonfirst-data-postalcode https://github.com/whosonfirst-data/whosonfirst-data-postalcode
repository that contains no
data but instead a pointer https://github.com/whosonfirst-data/whosonfirst-data-postalcode/blob/master/data.json
to all the repositories that do have postalcode data. We also do the same
for venues in the USA https://github.com/whosonfirst-data/whosonfirst-data-venue-us .


The whosonfirst-data https://github.com/whosonfirst-data/whosonfirst-data
repository is the obvious exception (or
perfect example, depending on how you look at it) to the scenario
described above. This repository contains all "administrative"
placetypes (all the places between and inclusive of continents to
microhoods) for the entire world. It is possible to imagine that
the sum total of all the neighbourhoods in the world will require
putting them in a separate repository but we are going to hold off
doing that for as long as we can.





Can a stretch of land be a person in the eyes of the
law? Can a body of water? In New Zealand, they can. A
former national park has been granted personhood, and a
river system is expected to receive the same soon.

http://www.nytimes.com/2016/07/14/world/wha[…]rivers-can-be-people-legally-speaking.html

[…] in advance if it's bumpy, weird or incomplete.



The goal is to provide a clear and reproducible template for
subdividing a repository (that has grown too large) in such a way
that all those repositories may eventually be merged in to a single
collection without any conflicts. Who's On First documents are
grouped by country and placetype as a convenience but, technically,
they can live anywhere.


Finally, Who's On First records should always have a
wof:repo property indicating the repository to which they belong.
If they don't that's a bug.
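
Since a missing wof:repo property is a bug, it is the kind of thing that is easy to check for mechanically. Here is a minimal sketch, assuming records are GeoJSON Features with their Who's On First properties stored under "properties". The function name and the example records are hypothetical, with placeholder IDs.

```python
def missing_repo(feature):
    """Return True if a GeoJSON Feature lacks a usable wof:repo property."""
    props = feature.get("properties") or {}
    return not props.get("wof:repo")


# Made-up records for illustration only.
records = [
    {"type": "Feature",
     "properties": {"wof:id": 1, "wof:repo": "whosonfirst-data-venue-us-ca"}},
    {"type": "Feature",
     "properties": {"wof:id": 2}},
]

for record in records:
    if missing_repo(record):
        print("record", record["properties"].get("wof:id"), "has no wof:repo")
```

Run over a whole repository, a check like this would list every record that could not be routed back to where it belongs.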


Huzzah! 



Images used in this blog post: 


• Entertaining guests at the California alligator farm, Los
Angeles https://www.flickr.com/photos/chs_commons/16356193176/
California Historical Society

• Franzi and cable
car https://www.flickr.com/photos/philgyford/14251596747/
Phil Gyford

• Postcard of the sea
serpent https://www.flickr.com/photos/nantuckethistoricalassociation/3177485572/
Nantucket Historical Association

• Turtle club, caught between San Pedro & Catalina Island,
Cal. https://www.flickr.com/photos/chs_commons/15485394421/
California Historical Society

• Sitting with travel trailer at St. Petersburg
Beach https://www.flickr.com/photos/floridamemory/14889402267/
State Library and Archives of Florida

• The Endless
Staircase https://www.flickr.com/photos/benterrett/9785396672/
Ben Terrett


2016-10-07





see also: 




I kind of hope I just write the 
same email to you every 
year... 



Last year, following the MCN
2015 http://www.aaronland.info/weblog/2015/11/09/keinholz/#mcn
conference I sent an email to the Cooper Hewitt telling the
story of what I'd
said http://www.aaronland.info/weblog/2015/12/31/belief/#epilogue
to people when asked "what it meant" that Seb
Chan http://www.freshandnew.org/ and I had both left the
museum so soon after launching the Pen.

Earlier this year, following the launch of the London
Biennale http://www.londondesignbiennale.com/ , during
which the Cooper Hewitt traveled not just the museum's wallpaper
room but also the Pen and all its related visit
technology https://londonbiennale.cooperhewitt.org/about/ ,
I sent another email to the museum. It was titled "I kind of hope I just
write the same email to you every year..."


This is an annotated version of what I said: 



I wanted to take a moment to send a note and say 
congratulations on the London Biennale launch! 


It goes without saying that I am not so far away
from the
Pen http://www.aaronland.info/weblog/2016/03/09/osha/#bespokiness
(and its relations) that I
don't have something like a vested interest in all of
this stuff and remain eager to see it succeed.


On the other hand aside from answering questions 
and the occasional technology-therapy session it is 
completely out of my hands now, which makes it that 
much more exciting to see the work grow and 
evolve. 


This is how it should be. 



I don't expect that the London Biennale will garner
the attention or the praise that it should, though. 


Unfortunately I think that's just part of a larger 
dynamic in the museum sector - a perverse mix of 
tall-poppy syndrome and a bad habit of 
compensating for the future with shiny things 
meaning that some people will only see "last year's 
project" - and not reflective of the work that the CH 
did. 


You're not supposed to say things like that out loud, are you? 


But I think it is work that is a big deal, and the 
museum shouldn't be shy in telling people about it. 


I think it's a big deal because the museum was able 
to, in no particular order: 

* Adapt and modify third-party work (the Local
Projects http://localprojects.net/project/cooper-hewitt-smithsonian-design-museum/
application code for the wallpaper room).



Even just as a thought experiment, it would be good to keep 
an inventory of times this has actually ever happened in the museum 
sector. Who are the institutions, and what are the projects, where the 
warranty label on a third-party deliverable has been ripped off (or 
even just removed after its expiry date) in order to look inside the 
box and make it sing another tune? 


I would gladly be proven wrong on this but I think that list, in 
2016 , would be pretty short. 


* Adapt and modify and distill its own work and 
essentially figure out what parts of the collection 
website and visit infrastructure could be re-purposed.


There is this weird idea that in order for a project to succeed 
it must do everything out of the gate and in one coherent package. 
This is not just implausible, it's also bad engineering practice to have
something so tightly integrated that it can't be disassembled and 
reconfigured in to something new. 


The alternative, in real terms, is a whole lot of files and even 
more blocks of code being cloned from one project to another. That's 
okay. There is time to refactor everything in to shared libraries, 
assuming it even works. 



What's important in the work the museum did for London is 
that when we launched the Pen, in 2015, there was absolutely no 
attention paid to making something that could be repackaged in a 
white-box Pen system for another institution. None. It was too soon
and we needed to make sure that things worked in our museum 
before trying to understand what it meant to work in someone else's 
museum. 

In 2016, the Cooper Hewitt is probably about 70 to 75% of 
the way to having something that could be used with minimal fuss 
by another museum. That's a big deal on the face of it and an even 
bigger deal because it's all work that grew organically out of efforts 
that were done, in-house, for the re-opening. 



* Not make any substantial changes to the Pen itself. 
No one else will appreciate what that means but 
those of us that do... 


* Integrate a brand new third-party contractor (Dan 
Catt http://revdancatt.com/ ) and have them 
work successfully with all of the scaffolding that was 
originally developed not just for Local Projects but 
for everyone who came after them. 


* Demonstrate that the technology is not so tightly 
integrated to the museum itself that it is essentially a 
glorified installation piece. 


I am not going to comment on every bullet point so just take
a moment to re-read those last three items. If you doubt my
comments about the Pen, then read
this http://www.aaronland.info/weblog/2016/03/09/osha/#bespokiness .



I think it's clear that my bias has always been about 
getting the Pen on to the Mall (no one can say it's not 
possible now, only that it would be hard) but even if 
that didn't happen you know enough to send out 
modified versions of the (84") collections app tables 
to other museums, SI or not, now. 


I am going to repeat this one because it's important to me: No 
one can say that it is impossible to deploy the Pen to all the 
Smithsonian museums on the Mall in Washington, now.

It would be challenging but most of those challenges are at
the visitor
services http://www.aaronland.info/weblog/2015/12/31/belief/#design-eagle
layer and to an equal or lesser degree about
changing internal cultural practices. Both of those are problems in 
the museum sector long overdue for some attention, anyway. The 
really hard problems are not technology-related nor are they, if you 
imagine operating at the scale of the entire Smithsonian, financial. 
The hard problems are elsewhere and they are worth doing. 


In the meantime, can you imagine how mind-bendingly 
awesome it would be to go to the Mall and be able to come away 
with a permanent record of all the things you collected at each of the 
Smithsonian museums you visited? 



Seriously
amazing http://www.aaronland.info/weblog/2012/10/23/callback/#otaku ,
even.


We are probably still a few years away from Seb's
dream of screens integrated in to display cases (this
is where Seb would start pounding the table saying
there is already some museum in Holland doing it
today, which doesn't make him right) but the ability to
travel the collection
tables http://www.theverge.com/2015/3/11/8182051/smithsonian-cooper-hewitt-design-museum-reopening-pen-4k
and all they make
possible gets the sector a whole lot closer in the
meantime.


* Portable immersion room!!!!!


It is no longer crazy talk to imagine setting up a 
clone of the immersion room with, say, all the 
textiles during an event like Fashion Week. 


I was unaware at the time I wrote this that it was, in fact, 
Fashion Week in New York City. 



The museum has not traveled the 84-inch "Collections App" 
tables anywhere yet. The point is that having now reconfigured and 
traveled the smaller 55-inch "Wallpaper App" tables they can and 
more importantly they know they can. 


And like the email I sent last year, I think the best 
part is that all of this was done by staff all while they 
were doing all the other things they do at the 
museum. 


It is even more impressive given the unfortunate
timing of [redacted]. Micah and his
team https://labs.cooperhewitt.org/ should
be applauded for what they did, given the time and
the constraints. Hugged, even.


This is also important. None of this stuff works because there 
was something called "a successful v1 launch". They work and they
happen because there are people making them happen. 


There is no way to do the kinds of things that the Cooper
Hewitt has done, and continues to be capable of, without staff. It
doesn't have to be big staff. It simply needs to be a group of people
who work well together and who have sufficient freedom and
autonomy (and the corresponding responsibility) to imagine and
prove their ideas to the rest of the museum.


That is true of almost any department in any organization so 
the salient point is not that the people working on digital are special, 
only that they are not programmable 

toasters http://idlewords.com/talks/deep_fried_data.htm . 


(Also Matt O'Connor because... well, portable
immersion room!!!!)


Did I mention that the museum can now tour the wallpaper
room http://www.cooperhewitt.org/events/current-exhibitions/immersion-room/ ,
arguably the most popular
exhibition since the re-opening? Matt did that.


I imagine there was a non-trivial capital cost to the 
Biennale and I gather everyone aged a few extra 
years for every week leading up to the opening. 
We're still not at a stage where any of this is 
necessarily easy or cheap, yet. 



The goal is to make "simple things easy and hard things
possible", not to groom magic ponies.


But again, like last year, when you look at what's 
been accomplished and you imagine the time and 
cost it would take for a traditional client-service firm 
(or firms) to do the same it's pretty easy to imagine 
an impossible budget. One that would have 
prevented the work that went to London from ever 
happening. 


If you think I am joking, read these two blog posts side-by-side
(Micah Walter's run-through of the entire London Biennale
project http://labs.cooperhewitt.org/2016/traveling-our-technology-to-the-u-k/
and Lisa Adang's post-mortem of
developing a brand new Pen integration for the Process
Lab http://labs.cooperhewitt.org/2016/process-lab-citizen-designer-digital-interactive-design-case-study/ )
and consider that both of these projects were happening at the
same time.



When you factor in the costs of outsourcing, including
(probably) not being able to piggy-back again
on the work done for London for the next thing,
then it starts to put things in perspective.


I also know those are just fancy words when you're 
looking at the reality of a spreadsheet so I wanted to 
say thank you for continuing to take leap after leap 
of faith in to... well, so far it's worked out pretty well 
hasn't it? 


This is how it should be. 


2016-11-04




